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abstract 


^  NL2S0L  is  a  modular  program  for  solving  nonlinear  least-squares  problems 
that  incorporates  a  number  of  novel  features.  It  maintains  a  secant  approxi¬ 
mation  S  to  the  second-order  part  of  the  least-squares  Hessian  and  adaptively 
decides  when  to  use  this  approximation.  S  is  "sized"  before  updating,  some¬ 
thing  which  is  similar  to  Oren-Luenberger  scaling.  The  step  choice  algorithm 
is  based  on  minimizing  a  local  quadratic  model  of  the  sum  of  squares  function  ^ 
constrained  to  an  elliptical  trust  region  centered  at  the  current  approximate 
minimizer.  This  is  accomplished  using  ideas  discussed  by  Mor^,  together  with 
a  special  module  for  assessing  the  quality  of  the  step  thus  computed.  These 
and  other  ideas  behind  NL2S0L  are  discussed  and  its  evolution  and  current  im¬ 
plementation  are  also  described  briefly. ^ 
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SIGNIFICANCE  AND  EXPLANATION 


Mathematical  models  are  frequently  used  in  the  physical  and  social  sciences. 
Such  models  usually  contain  parameters  that  may  be  chosen  to  make  the  model  “fit" 
some  given  data  as  well  as  possible  in  some  specified  sense.  One  sense  often 
specified  is  that  of  minimizing  the  sum  of  the  squares  of  the  errors  that  the 
model  makes  on  the  given  data,  and  when  some  of  the  parameters  appear  nonlinearly 
in  the  model,  determining  the  model  parameters  requires  solving  a  nonlinear  least- 
squares  problem.  In  various  contexts,  such  as  when  the  data  contain  large  measure¬ 
ment  or  transcription  errors,  one  nay  wish  to  solve  a  nonlinear  least  squares 
problem  in  which  the  model  errors  at  the  optimal  parameters  are  large  enough  that 
conventional  nonlinear  least-squares  algorithms,  such  as  the  Gauss-Newton  or 
Levenberg-Marquardt  methods,  perform  poorly.  The  present  work  describes  an 
approach  that  usually  gives  good  performance  whether  or  not  the  model  errors  are 
large.  In  part  this  paper  describes  the  computer  ..  ide  NL2S0L,  which  implements 
the  ideas  presented  here.  NL2S0L  also  embodies  many  of  the  ideas  presented  by 
John  Dennis  and  Robert  Schnabel  in  the  short  course  that  they  gave  in  May,  1979 
at  the  University  of  ’’i scon sin-Madi son  under  sponsorship  of  the  Mathematics 
Research  Center. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive  summary 
lies  v;ith  MP.C,  and  not  with  the  authors  of  this  report. 
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AN  ADAPTIVE  KOriLINEAR  LE/3T-SQUAKES  ALGORITHM 

>y  . 

J.E.  Dennis,  Jr.,  Rivid  M.  Gay,  Roy  E.  Welsch 

1.  Introduction. 

This  project  'began  in  order  to  neet  a  need  for  a  nonlinear  least-squares 
algorithn  which,  in  the  large  residual  case,  would  be  core  reliable  than  the 
Causs-Newton  or  Levenberg-1-fe.rquardt  aethod  [Dennis,  1977]  and  nore  efficient 
than  the  secant  or  variable  cetric  algorithus  [Dennis  &  ftorfi,  1977]»such  as 
the  Dnvidon-Fletcher-Powell  method,  which  are  intended  for  general  function 
ainiaization. 

We  have  developed  a  satisfactory  computer  program  called  NL2S0L  based 
on  ideas  of  Dennis  and  Welsch  [1978]  and  our  prirary  purpose  here  is  to  report 
the  details  end  to  give  some  test  results.  On  the  other  hand,  we  learned  so 
9uch  during  the  development  which  seems  likely  to  be  applicable  in  the 
development  of  other  algorithms  that  we  have  chosen  to  expend  our  exposition 
to  include  some  of  this  experience. 

In  section  2  ve  set  out  the  problem  and  the  notation  ve  intend  to  use. 

Section  3  deals  with  our  way  of  supplementing  the  classical  Gauss-IIewton 

approximation  to  the  least-squares  Hessian  by  various  analogs  of  the  Davidon- 

flet Cher- Powell  method.  Section  briefly  describes  our  interpretation  of 

the  Oren-Luenburger  [Oren,  1973]  sizing  strategy  for  this  augmentation.  In 

.;tlon  5  ve  describe  our  adaptive  quadratic  modeling  of  the  objective  function. 

Section  6  contains  a  discussion  of  the  stopping  criteria  and  covariance  matrices 

and  section  7  contains  test  results.  The  NL2S0L  Usage  Sumaiy  is  Included  as 
»• 

ah  appendix.  •  _ 

This  work  was  supported  in  part  by  the  National  Science  Foundation  Grants  DCR75-1014 
MCS76-00324,  and  SOC76-14311  to  the  National  Bureau  of  Economic  Research,  Inc.,  and 
was  sponsored  in  part  by  the  United  States  Army  under  Contract  No.  DAAG29-75-C-002-^ . 
This  work  is  based  upon  work  supported  by  the  National  Science  Foundation  under 
Grant  No.  MCS78-09525. 
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2.  I'hc  Nonlinear  Lccst-gquargs  Protlen. 

There  are  good  reasons  for  numerical  analysts  tp,  study  least-squares 
prohlcr.s.  In  the  first  place,  they  are  a  coeputation  of  prinary  inportance  in 
statistical  data  analysis  and  hence  in  the  social  sciences,  as  veil  as  in 
the  core  traditional  areas  vithin  the  physical  sciences.  Thus  a  computer 
algorithm  able  to  deal  efficiently  vith  both  sorts  of  data  is  widely 
applicable. 

Although  applicability  shovdd  always  constitute  sufficient  Justification 
to  tackle  a  problem,  in  this  ease  there  is  also  an  opportunity  for  more  far 
reaching  progress  in  numerical  optimization.  In  order  to  be  more  specific, 
it  will  be  useful  to  have  a  formal  statement  of  the  nonlinear  least-squares 
problem. 

We  adopt  notation  consistent  vith  fitting  a  model  to  n  pieces  of 
data  using  p  parameters:  Given  R:  -►  R°,  we  wish  to  solve  the  unconstrained 

minimization  problem 

(2.1)  min  fix)  =  r(x)^R(x)  =  h  I  r  (x)^. 

i=l  ^ 

Notice  that  if  j{x)  *  R*(x)  =  id^r^ix)) ,  then  the  gradient  of  f  is 

(2.2)  7f(x)  =  J(x)’’r(x) 
and  the  Hessian  of  f  is 

(2.3)  V^f(x)  =  J(x)^J(x)  +  t  rj^(x)V^rj^(x). 

iel 

Since  ve  ere  seeking  a  minimum  of  f,  we  wish  to  have  f(x*)  =  0,  an 
obviously  global  minimum;  in  the  more  realistic  case  where  f  is  not  anywhere 
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Dear  zero,  ve  vill  bo  forced  to  terainatc  on  small  parameter  changes  or  to 
use  some  other  convergence  criteria  (see  Section  6).  It  is  clear  from  (2.2) 
that  Vf(x*)  =  p  and  R(x*)  ^  0  ‘corresponds  to  R(x*)  C(j(x*)),  the 

column  space  of  j(x*).  Thus  it  is  essential  as  the  iteration  proceeds  that 
C(j(xj^))  be  approximated  very  veil  in  the  usual  case  vhere  p  <  n  and 
R(x«)  4  0. 

In  addition  to  caking  a  precise  convergence  test  possible,  having  an 

accurate  Jacobian  catrix  means  that  a  good  approximation  to  a  portion  of 

the  Hessian  is  available  as  a  byproduct  of  the  gradient  computation.  In  fact, 

2 

it  is  often  possible  to  ignore  the  second  order  term  rrj^(x)v  r^(x)  of  the 
Hessian  altogether  on  the  grounds  that  if  the  nonzero,  residuals  are  not  of 
a  sort  that  reinforce  their  nonlinearity,  j(x)^j(x)  is  a  sufficiently  good 
Hessian  approximation  [Wedin,  1972,  197^a-c] , [Dennis,  1977].  In  the  resulting 
Gauss-Nevton  method,  the  "Newton  step"  from  x^  is  defined  by  the  linear 
system  of  equations 

(2.1»)  j(xj.j’’j(xj^)sj^  =  .:j(Xj^)'^R(Xj^). 

Since  (2.4)  is  the  system  of  normal  equations  for  the  linear  least- 
squares  problem 

(2.5)  min  ?s(j(xj^)s  +  R(Xj^))'*^( j(Xj^)s  +  R(Xj|.)), 

s 

it  is  better  to  obtain  Sj^  from  a  QR  decomposition  of  J(xj^)  (see  [Golub,  1969]). 

Ve  can  view  (2.5)  as  defining  a  quadratic  model  in  x  *  ^  ♦  s  of 
the  least-squares  criterion  function  (2.1); 


(2.6) 


-U 

q°(x)  =  H 

+  Js(x-Xj^)'^j(Xj^)V(Xj^)(x-Xj^). 

Fron  (2.1),  (2.2),  (2.3)  ve  see  that  the  difference  between  this  Gauss- 
Kevton  nodel  and  the  usual  llew-ton  rodel  obtained  fron  a  quadratic  Taylor 
expansion  aroimd  Xj^  is  Just  the  term  ?5(x-Xj^)  [rr^(xj^)v  r^(x^)  ]  (x-Xj^) . 

The  conceptual  difference  between  these  two  models  is  interesting  in 
that  it  exposes  some  reasons  for  the  deficiencies  of  the  Gauss-Newton 
algorithm.  The  Newton  model  is  based  on  the  assumption  that  f  can  be 
adequately  modeled  by  a  quadratic,  while  the  Gauss-Newton  model  (2.6)  is 
shown  by  (2.5)  to  resxilt  from  the  stronger  assxm^tion  that  R  can  be 
adequately  modeled  by  an  affine  function. 

3.  An  Augnentaticn  of  the  Gauss-Newton  Hessian. 

■  Our  purpose  in  this  section  is  to  suggest  a  way  to  augment  the  Gauss- 
Kevton  model  (2.6)  by  adding  an  approximation  to  the  difference  betw’een  it 
and  the  quadratic  Taylor  expansion  to  obtain 

• 

(3.1)  q^(x)  =  iiR(x^)'^R(Xj^)  +  (x->^)^j(Xj^)'^R(x^) 

♦  I5(x-Xj^)^[j(xj^)'^j(x^)  +  Sj^](x-Xj^). 

We  will  suggest  an  approximation  rule  for  Sj^  which  is  simple,  general 
and  geometric.  The  approach  is  to  decide  on  a  set  of  desirable  characteristics 
for  the  epproximant  and  then  to  select  to  be  the  nearest  such  feasible 

point  to  S^.  The  rationale  is  that  every  point  in  the  feasible  set  incorporates 
equally  well  the  new  information  gained  at  and  that  taking  the  nearest 
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polnt  (in  a  sense  to  be  explained  later)  corresponds  to  destroying  as  little 
of  the  information  stored  in  Sj^  as  possible. 

Currently  we  begin  with  =  0,  since  this  is  both  cheap  and  reasonable  in 
the  sense  that  =  q..  Suppose  S.  is  available.  First  let  us  decide  on  the  properties 

V  V/  Jv 

2 

should  have.  Remember  that  it  is  to  approximate  ^i^^+1^ 

it  should  obviously  be  symmetric.  It  is  easy  to  find  examples  where  the  tern 
to  be  approximated  is  indefinite,  so  we  reject  any  restriction  on  the  eigen¬ 
values  of  Sj^+2.*  finally,  ve  want  to  incorporate  the  new  information  about 
the  problem,  \+l’  ^k+1'  standard  way  to  do  this  is  to 

ask  the  second  order  approximant  to  transform  the  current  x-change  into  the 
observed  first  order  change,  i.e.. 


(3.2) 


®k+i^  =  ^i^Vi^^ 


^k+l\+l  ■  ^k\+l  =  ^k* 


It  is  perhaps  worth  noting  in  passing  that  we  tested  several  choices  for 
Including  the  Broyden- Dennis  [Dennis,  1973]  choice  ~  ^k+l'^k+l‘ 

and  the  Betts  [1976]  choice  Happily,  (3.2),  which 

makes  more  use  of  the  structure  of  the  problem,  was  the  slight  but  clear  winner. 

In  summary,  ve  choose  Sq  =  0,  c  =  {S;  S  *»  S  and  S&x^  =  y^^l. 

Our  choice  of  from  J  is  made  in  analogy  with  the  DFP  method  for 

unconstrained  minimization  [Dennis  &  More,  1977].  Before  giving  the  formula  and 
its  properties,  ve  reviev  some  useful  notation. 

If  A  is  any  real  matrix,  then  the  Frobenius  norm  of  A  is 

If  ®  Is  symmetric  positive  definite  matrix,  then 
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B  has  a  cynr.ietric,  positive  definite  square  root,  B  .  Define 
-V-  Jit: 

1 1^1  If  B  “  1 1 Ip*  veighted  Frobenius  norm  is  a  natural 

analog  of  the  Frobenius  norm  for  eC  matrix  vhen  the  standard  inner  product 
norm  on  the  domain  is  replaced  by  ll^llg  ~  •  The  following  theorem 

gives  the  update  formulas  as  veil  as  their  defining  properties.  It  is 
Just  a  restatement  of  Theorem  7*3  of  [Dennis  and  Mord,  1977]. 

THEOKSM  3.1:  Let  v'^AXj^  >  0.  Then  for  any  positive  definite  symmetric 
matrix  H  for  which  =  v, 

l|s  -  \IIf.H  ^ 


u-y 

T  T 

In  NIi2S0L  we  compute  corresponding  to  ▼  =  4g^  »  "^k+l^+l  ” 

This  corresponds  to  weighting  the  change  by  any  positive  definite  symmetric 
matrix  that  sends  to  Thus  we  hope  the  metric  being  used  is  not 

too  different  from  that  induced  by  the  natural  scaling  of  the  problem. 
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1».  Slzlnr:  the  Hessir.n  Au^.r.entation. 

It  is  veil  known  ty  nov  that  the  update  nethods  do  not  generate 
approximations  that  hecome  arhitrarily  accxirate  as  the  Iteration  proceeds. 

On  the  other  hand,  ve  know  that  for  zero  residual  problems,  shoiild 

ideally  converge  to  zero  and  that  if  it  does  not  at  least  become  small  in 
those  cases,  then  the  augmented  model  (3.1)  cannot  hope  to  compete  with  (2.6), 
the  Gauss-Newton  model. 

The  crux  of  the  problem  can  be  seen  by  observing  that  even  if 

happened  to  be  zero  and  even  if  y,  defined  by  (3.2)  were  vised  to  make  the 

& 

f 

update  to  Sj^,  then  Sj^+^AXj^  "  would  be  the  same  as 

on  the  orthogonal  complement  of  {Ax^,v}. 

Ve  use  a  straight foinraird  modification  of  the  Oren-Luenburger  self  scaling 
technique  [Oren,  1973].  The  idea  is  to  update  rather  than  Sj^,  to  get 

\+l‘  The  scalar  is  chosen  to  try  to  shift  the  spectrum  of  in  hopes 
that  the  spectrum  of  will  overlap  that  of  the  second  order  term  we  are 

approximating.  Ve  could  take  the  scalar  to  be 


.  T 

T 

Ax^AXj^  J 

L  T 

We  prefer  to  call  this  sizing,  and  since  we  are  primarily  concerned  with 
being  too  large,  we  actually  take 


(»*.l) 
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Vhatever  this  strategy  is  called,  notice  that  vhen  =  0,  our 

y.  *=  0.  and  so  =  0  and  S,  ,  =  0.  The  use  of  sizing  factor  i^.l)  made 
■'k  k  k+1 

a  significant  difference  in  the  perforniance  of  the  algorithm.  See  Table  IV. 

5.  Adaptive  Quadratic  Modeling. 

In  section  3  ve  noted  that  =  0,  vhich  means  that  the  augmented  model 

(3.1)  is  initially  eq.ual  to  the  Gauss-Nevton  model  (2.6).  Tests  have  shown 

G  S 

that  often  predicts  better  than  small  k, 

so  it  seems  useful  to  have  some  vay  to  decide  vhich  model  to  use  to  determine 

the  step. 

Betts  [1976]  also  starts  with  =  0  and  takes  Gauss-Nevton  steps 
for  at  least  p  iterations  and  until  Ax  is  small  enough  to  make  it 

A 

likely  that  is  near  x*.  It  seems  therefore  as  though  his  aim  is 

to  make  a  last  few  refining  iterations  based  on  the  augmented  Hessian.  The 
heuristic  ve  use  in  NL2S0L  usually  uses  the  augmented  Hessian  much  sooner. 

NL2S0L  uses  a  model/trust  region  strategy  to  pick  A*^«  The  step  is 
of  the  form 

(5.1) 


where  is  the  current  Hessian  approximation,  is  a  diagonal  scaling 

matrix  and  >.  0  is  chosen  by  the  safeguarded  Reinsch  iteration  as  in 
(Hor#,  1978],  with  the  case  of  near  singularity  in  handled  as  in 

[Cay,  1979].  The  important  thing  is  the  idea  of  having  at  Xj,  a  local  quadratic 
model  qj^  of  f  and  an  estimate  of  a  region  in  which  q^^  is  trusted  to 
represent  f.  The  next  point  is  chosen  to  approximately  minimize  q^^ 
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In  this  region  or  to  Elnicize  in  an  approximation  to  this  region.  In 

either  case,  the  information  gained  about  f  at  is  then  used  to 

update  the  model  and  also  to  update  the  size  or  shape  of  the  trust  region. 

We  "begin  vith  the  assumption  that  holds  globally.  Since  the 

trust  region  revision  is  always  based  on  the  length  of  the  step  Just  taken, 
this  causes  the  radius  to  be  set  automatically  by  the  initial  Gauss-Newton 
step.  This  scheme  often  works  well,  but  it  can  have  problems.  If  the  Gauss- 
Newton  step  is  too  long,  the  trust  region  may  have  to  be  shrunk  repeatedly  vith 
attendant  evaluations  of  the  residual  function  R  to  obtain  an  acceptable  x^. 
Much  more  serious  is  the  possibility  of  overflow.  The  initial  assirrotion  of 
global  linearity  can  be  overruled  by  assigning  a  small  value  to  "VClMAXO),  the 
maximum  length  allowed  for  the  very  first  step  attempted. 


Figure  1  will  perhaps  be  helpful  at  this  point.  The  ellipses  represent 
the  contours  of  and  the  circle  is  the  trust  region  —  our  picture  assumes 

the  diagonal  scaling  matrix  to  he  the  identity.  The  point  is  the  "Newton 

step"  or  global  minimum  of  the  convex  quadratic  model  qj^,  and  the  curve  s(r) 
represents  the  locus  of  minimizers  of  "*■  constrained  by 

llsjlg  ^  r,  0  <  r  <  •».  Complete  details,  based  largely  on  [More,  1978],  can  be 
found  in  [Gay,  1979],  but  we  choose  =  s(r)  so  that  between 

0.9  and  1.1  of  the  current  trust  radius.  (The  actual  choice  of  is  discussed 

In  Section  7.)  ■ 

Since  ve  were  using  this  adaptive  approach,  it  is  not  surprising  that 


ve‘ also  thought  of  using  the  new  information  at 


Vl 


to  select  between 


Vi  Vi  for  use  in  determining  ^^^2*  decision  rule  is  rather 


stralghtforvard.  Since  «=  0,  vc  begin  using  the  Gauss-Nevton  model. 
After  caking  a  prospective  step  based' on  the  currently  preferred  model 

1  1  1  i 

to  obtain,  say,  compute  and  If  ^  fj^ 

1  2  1 
Is  discarded,  but  first  the  other  node!  is  evaluated  at 

to  see  hov  well  it  agrees  with  If  there  is  not  sufficient  agreement 


between  f 


k+1  ^  (i-e.,  if 

"  ^k+1^  "  ^k+ll^» 


(5.2) 


then  we  keep  the  original  codel  preference,  shrink  the  trust  region,  and 

try  again.  We  shrink  the  trust  region  radius  by  the  factor  suggested  by 

Fletcher  [1971]  and  described  by  tore  [1978,  p.  109).  If  the  agreement 
1  2  1 

between  f^^^^  and  <ljj(^+l)  sufficiently  good,  then  we  change  our 

2  2 

model  preference  to  q  and  coepute  x  using  the  same  trust  region.  If 
o 

is  unacceptable,  then  the  trust  region  is  shrunk  and  we  repeat  the 
above  process  on  the  smaller  trust  region  with  whichever  model  gave  the 
least  function  value,  but  now  we  no  longer  consider  changing  models  while 
continuing  to  seek  an  acceptable 

If,  say  yiel<is  an  acceptable  function  decrease  but 

(5.3)  "  ^^*k^  1  oiax{lO.[q^(xJ^^)  -  f(Xjj)l. 

and  Axj^  =  Vi  -  ^  VBs  computed  by  (5.1)  with  >  0,  then  we  deem  it 
worthwhile  to  try  recomputing  with  a  larger  trust  region  radius 

before  accepting  the  step.  Eence  we  double  the  radius  and  obtain,  say, 

1*  1*  1  1*  1 

^  replaces  and  we  again  check 

1 1 

whether  to  double  the  radius.  Otherwise  ve  discard  **>*1  accept 

Vr 
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Vhch  an  acceptable  found, 

compared  to  f  ...  We  have  found  that  it  is  best  to  retain  the  currently 
KtX  , 

preferred  model  if  (5.2)  holds  vith  =  *k+l’  other 

model  does  a  significantly  better  Job  of  predicting  the  new  function  value. 

Once  bns  been  found,  we  decide  what  trust  region  radius  to  use 

first  when  seeking  radius  chosen  has  the  form  *^11^1^+1^^112* 

where  -  x^.  If  f(xj^+j^)  -  f(x^)  i  ^ 

is  Fletcher's  [1971]  decrease  factor;  if  either  (5.3)  holds  with  =  *k+l’ 

or  -[Vf(xj^^^)  -  Vf(xj^)]ll2  l0.5|lvf(xj^^^)||,  or 

T  T 

75’ Ax^Vf (x^) ,  then  y  »  2;  otherwise  y  »  1.  This  rule  for 
updating  the  radius  is  a  modification  of  one  described  by  Powell  [1970]. 

6.  Convergence  Criteria  and  Covariance. 

An  important,  sometimes  difficult  issue  in  practical  computing  is  the 
matter  of  deciding  when  to  stop  an  iterative  procedure.  We  have  chosen  to 
include  four  convergence  criteria  in  NLPSOL:  tests  for  "cosine  convergence", 
"variability  convergence",  "residual  convergence",  and  "X-convergence". 

At  any  critical  point  of  the  sum  of  squares  function  (2.1),  such  as  the 
desired  ninimizer  x",  the  residual  vector  R  is  orthogonal  to  all  columns 
of  the  Jacobian  matrix  J.  ^foreover,  the  angles  between  B  and  the  columns 
of  J  are  independent  of  the  scale  of  R  and  colximns  of  J,  so  it  is  reasonable 

•tp  use  a  test  based  on  these  angles  [Dennis,  1977].  Hence,  we  define 

0  if  R(x)  *  0 

»M<|  J.  ^i(x)’'r(x)  |/(  1 1  J.  _,(x)  1 1,1  |R(x)  1 1,) 

»  ^ji-  1  i  *  i 

Otherwise, 


(6.1)  COSMAX(x) 
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vhere  J.  .(x)  is  the  1— colvcin  of  J(x),  and  ve  detect  cosine  convergence 
at  any  iterate  Xj^  for  vhich  COSKAX(Xj^)  is  less  than  or  equal  to  the  cosine 
convergence  tolerance  V(CCONCR).  By  default,  =  10  ,  so  the  COSMAX  is 

scale  invariant  over  a  wide  range  of  probleas. 

For  statistical  data  analysis  a  different  type  of  convergence  criterion 
is  often  appropriate.  Since  there  is  inherent  variability  in  the  data,  it  is 
generally  not  useful  to  continue  iterating  when  a  candidate  step  s=(s^». • •  ,s^)=  Ax^ 
is  generated  for  which 

(6.2)  max  {|s^|/s.e. (x^)} 

is  sufficiently  small.  Here  s.e.(x^)  denotes  some  estimate  of  the  standard 

error  (sqiiare  root  of  the  variance)  of  the  illl  component  of  the  current  parameter 

estimate  x^  and  so  is  a  function  of  the  statistical  variability  in  the  data. 

An  alternative  to  (6.2)  suggested  by  Pratt  [1977]  is  to  consider  general 
T 

linear  combinations  i  s  of  the  components  of  s,  i.e. 

(6.3)  Bax{|l’'s|/(i’’Vj.i)’®:  0}  =  (s^V‘^s)\ 

where  is  a  current  estimate  of  the  covariance  matrix.  .  For  s.e.  (xj^) 

■»  ''here  e^^  is  the  i^  standard  vinit  vector,  (6.3)  clearly  dominates 

(6.2),  so  we  have  chosen  to  include  a  test  based  on  (6.3)> 

*2  —1  *2 

Our  choice  for  7^  was  *  where  Oj^  is  the  current  residtial  sum 

of  squares  divided  by  max{l,n-p},  i.e. 

(6.>»)  *  2f(x^)/max{l,n-p}, 

and  is  the  current  Hessian  approximation,  i.e.  J^J(x^)  for  the  Gauss- 
Revton  model  and  J^J(:^)  ♦  for  the  augmented  model.  Whenever  a  candidate 
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stcp  6  Is  generated  for  vhich  is  positive  definite,  we  thus  compute 


(6.5) 


VARIAbCs)  =  s'^F^S/a^, 


and  ve  detect  variability  convergence  if  VARIAB(s)  does  not  exceed  the 
variability  convergence  tolerance  V(VCOKCR). 

For  full  Newton  steps,  i.e.  s  =  =-  Hj^^Vf(xj^),  (6.5)  gives  a  quantity 

closely  related  to  the  relative  reduction,  PRSDj^  ,  in  ^'(^)  that  is  still 
possible  according  to  the  current  model.  Specifically,  (6.5)  and  (6.1*)  imply 

PREDj^  =  [-Vf(Xj^)  s  -  ^  Hj^s  J/f{Xj^) 

1  n'*’„  N,-,  > 

'  ^  V 


VARIAB(s")/max{l ,n-p}. 


Thus  at  least  for  full  Newton  steps  (the  steps  usually  taken  near  x*) ,  the 

variability  convergence  test  checks  w'hether  the  predicted  relative  reduction 

still  possible  in  the  residual  stim  of  squares  is  small. 

^  •* 

Zero-residual  problems,  those  for  which  R(x*)  =0,  require  special 

consideration.  Indeed,  it  can  be  shown  that  if  J(x*)  is  nonsingular,  then 

li’m  inf  C0SM‘lX(x)  >  l/[p^cond( j(x*) ) ]  >  0,  where  cond(j)  is  the  condition 

IIx-x*I|-*0  .• 

number  of  J  (ratio  of  largest  to  smallest  singular  value).  Moreover,  for 

s  «  s(x)  the  Newton  step  from  the  Gauss-Newton  model,  i.e.  s(x)  =  7(j^j(x))  ^J^R(x)  , 

It  is  easily  seen  that  lim  VARIAB(s(x))  =  max{l,n-p}.  To  handle  this  case, 

x^  X* 

NL2S0L  detects  residual  convergence  if  (|r(xj^)([  ^  V(RCONCR).  We  regret  that 
this  convergence  test  must  be  sensitive  to  the  scaling  of  R. 


rr 


* 

I 
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It  Is  easy  to  specify  convergence  tolerances  too  strict  for  the  precision 
of  the  arithnetic  being  used.  Ve  have  therefore  included  a  fourth  convergence 
test,  the  X-convergence  test,  which  often  works  when  overly  tight  tolerances 
have  been  given  for  the  other  tests.  This  test  is  satisfied  whenever  a  step 
8  Is  generated  that  yields  a  Euch  smaller  function  decrease  than  expected 
(i.e.  f(xj^+s)  s^Vf (Xj^)/10)  and  the  relative  change  that  s  causes 

In  is  small,  i.e.  RELDX(xj^,Xj^+s)  ^V(XCONCR),  where 

RELDX(y,z)  =  max  (/[  |  +  IXiU* 

/ 

>bny  statistical  inference  procedures  req.uire  an  estimate  of  the  covariance 
matrix  at  the  solution  x*.  NL2S0L  provides  three  possibilities; 


(6.6) 

(6.7) 

(6.8) 


0  fi  «J  4J  B 


52  H-" 


where  a  is  given  by  (6.1j)  with  Xj^  =  x*.  When  (6.6)  or  (6.7)  is  specified, 
a  symmetric  finite  difference  Bessian  approximation  H  is  obtained  at  the 
solution,  X*.  If  H  is  positive  definite  [or  J  is  non-singular  at  x* 
for  (6.8)1,  the  specified  covariance  matrix  is  computed. 

A  detailed  discussion  of  all  three  covariance  forms  is  contained  in  [Bard_,197^] ■ 
The  second  form  (6.7)  is  based  on  asymptotic  maximum  likelihood  theory  and  is  perhaps 
the  most  common  form  of  estimated  covariance  matrix.  Ve  feel  that  (6.6),  the 
default,  is  more  useful  for  smaller  sample  sizes  and  in  other  cases  where  the 
conditions  necessary  [Rso,  1965]  for  the  asymptotic  theory  nay  he  violated.  The 
third  form  assumes  that  the  residuals  at  the  solution  are  small  and  is  therefore 
often  highly  suspect. 


b  » 
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^ •  Tcf.l 

We  have  run  I.’S2S0L  on  a  nimber  of  the  test  problens  reported  in  the 
literature.  In  particular,  ve  have  run  it  on  the  test  problens  listed  in 
[Gill  &  ^!u^ray,  197^»]  and  on  one  described  in  [Meyer,  1970].  The  original 
sources  for  these  problens,  together  vith  the  abbreviated  problem  names  used 
in  Tables  II-IV  and  some  notes,  are  given  in  Table  I. 

Table  I 


Original  Sources  of  Test  Problems 


Problem 

Note 

Source 

ROSiraROK 

[Rosenbrock,  i960] 

HELIX 

1 

[Fletcher  &  Powell,  1963 ] 

SINGULAR 

[Powell,  1962] 

WOODS 

[Colville,  1968] 

zai;gwii,l 

ENGVALL 

2 

[Zangvill,  1967] 

[Engvall,  1966] 

BRANIN 

[Branin,  1971] 

BEALE 

[Beale,  1958] 

CRAGG 

3 

[Gill  It  EJurray,  1976] 

BOX 

[Box,  1966] 

DAVIBONl 

It 

[Davidon,  1976] 

FRDSTEIN 

5 

[Freudenstein  &  Roth,  1963] 

VATS0K6,9,12,20 

[Kovalik  &  Osborne,  1968] 

chebqdS 

[Fletcher,  I965] 

BROWN 

6 

[Brown  &  Dennis,  1971] 

BARD 

[Bard,  1970] 

JENNRICK 

[Jennrich  4  Sampson,  1968] 

KOWALIK 

[Kovalik  4  Osborne,  1968] 

0SB0RNE1,2 

[Osborne,  1972] 

METER 

[Meyer,  1970] 

Notes  on  Table  I 

Note  1;  The  residual  vector  R(x)  for  this  problem  is  a  discontinuous 
function  of  x.  On  those  r\ins  where  NL2S0L  halts  vith  X- convergence,  the 
iterates  have  converged  to  a  point  of  discontinuity. 

Note  2:  This  is  a  linear  least-squares  problem  which  NL2S0L  solves 
in  one  step  when  the  limit  V(LKAXO)  on  the  length  of  the  first  step  is  increased 
slightly  from  its  default  value. 

Note  3:  The  original  Miele  problem  described  in  [Cragg  &  Levy,  19^9],  which 
Gill  and  Mirray  [l97b]  cite  as  the  source  for  this  problem,  does  not  have  the  las 
residual  component  r^Cx)  *>  X|^  -  1.  This  new  component  forces  X|j  to  move  more 

rapidly  towards  1,  but  otherwise  causes  no  noteworthy  change  in  the  performance 
given  by  KL.7S0L. 
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Kotc  l»:  This  is  another  linear  least-squares  prohlcn,  one  that  is  so 
111  conditioned  that  IIL2S0L  needs  tvo  steps  to  solve  it  when  using  double 
precision  on  an  IBM  370  conputer  with  vClIIAXO)  set  large.  With  a  double 
precision  of  a  few  bits  nore  accuracy,  such  as  that  of  the  KULTICS  (i.e. 
Honej'well)  machine  or  the  Univac  1110,  a  single  step  suffices  (for  large 
V(U'AXO)). 

Note  5:  In  all  our  test  runs,  NL2S0L  found  a  local  solution  to  this 
problem.  The  residual  vector  vanishes  at  the  global  solution. 

Note  6:  Gill  and  Murray  [1976]  call  this  problem  "Davidon  2". 

The  behavior  of  NL2S0L  is  determined  in  part  by  an  integer  array  IV  and 
a  floating-point  array  V,  which  contain  iteration  end  function  evaluation  limit, 
convergence  tolerances,  and  other  switches  and  constants.  In  the  runs  summarize 
in  Tables  II-IV,  most  of  the  IV  and  V  input  components  had  the  default  values 
given  them  by  subroutine  DFAULT.  Exceptions  included  the  following;  variabilit 
convergence  testing  was  turned  off  by  setting  V(VCONCR)  =  0;  and  on  problem 
MEYEB,  the  Iteration  and  function-evaluation  limits  were  increased. 

Table  II  summarizes  the  performance  of  NL2S0L  on  the  test  problem  set 
when  all  IV  and  V  input  components  have  their  default  values,  with  the  exception 
Just  noted.  Following  a  suggestion  of  J.J.  Nori  [1979],  we  obtained  new 
starting  guesses  for  many  of  the  test  problems  by  multiplying  the  standard 
starting  guess  by  ten  and  one  hundred.  The  colinnn  labelled  LS  gives  the 
base  10  logarithm  of  the  factor  by  which  the  standard  starting  guess  was 
multiplied.  The  problem  dimensions  appear  in  the  columns  headed  N  and  P, 
while  the  number  of  function  (i.e.  R(x))  and  gradient  (i.e.  J(x))  evaluations 
performed  respectively  appear  under  NF  and  NG.  The  coltnan  labelled  F  gives 
the  final  function  value  (half  the  sum  of  squares  of  R(x)),  while  the  one 
labeled  COSMAX  gives  COSMAX(x),  computed  from  (6.1)  (with  =  10*”^), 
at  the  final  x.  Uhder  C  Is  a  code  telling  why  NL2S0L  stopped:  R  means 
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rcsidual  convergence,  i.e.,  I Ir(x) | 10  C  means  cosine  convergence, 

I.jC.  C0SMAJ((x)  _<  lO”;  X  means  X  convergence  (see  §6)  with 

-13 

V(XCOIICR)  =  2.22  X  10  *,  and  F  means  function  evaluation  limit  reached 

without  convergence.  The  results  reported  in  Tables  II-IV  vere  obtained  on 

the  IBM  370/168  computer  at  the  tossachusetts  Institute  of  Technology,  and 

the  convergence  tolerances  Just  mentioned  are  the  defaults  for  this  machine, 

13  1^ 

which  has  a  \uiit  roundoff  of  I6  =  2.22  x  10  in  double  precision,  the 
precision  used. 

The  choice  of  scale  matrices  mentioned  in  §5  can  significantly 

affect  the  performance  of  NIi2S0L.  By  default,  =  diag(d^,. . .  ,d^)  is 
updated  by  the  rule 

dj  =  max  {[IIj.  -Hg  +  ffiax{0,S^^}  ]^,0.6d^“^ ,10“^} 

at  the  start  of  each  iteration,  beginning  with  d^^  =  0,  where  ^  denotes 
the  column  of  the  current  Jacobian  matrix  J(Xj^).  (The  factor  0.6  is 
actually  V(DFAC).  We  experimented  with  several  values  of  V(DFAC),  including 
zero,  0.5,  0.75,  and  one,  and  ve  felt  that  0.6  gave  the  best  overall  performance 
of  the  values  tried. )  The  advantage  of  this  choice  of  Dj^  is  that  it  is 
largely  scale-invariant. 

A  choice  of  which  is  not  at  all  scale-invariant,  but  which  gives  better 
performance  on  many  of  our  test  problems,  is  ~  identity  matrix.  Table 

III  shows  how  these  two  choices  of  Dj^  compare:  results  from  Table  II  are  repeated 
In  the  columns  headed  ESFAULT,  while  results  corresponding  to  =  I  appear 


under  0^1 
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Table  III  also  gives  results  fron  two  other  test  runs.  Those  under 
V(W!AXO)  =  10«#10  show  what  happens  when  the  bound  V(LMAXO)  on  the  2-nom 
of  the  very  first  step  attenpted  is  increased  from  its  default  value  of  100 

to  10^^,  while  those  under  V(CCOJICR)  =  10*»-8  show  what  happens  when  the 

.  _7  _8 

cosine  convergence  tolerance  is  decreased  from  10  to  10  .  In  both  of 

these  test  runs  the  default  D.  was  used. 


» 

% 

I 
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Tabl e  J 1 

Default  M.2.S0L  Test  Summar>' 


I'ROBLKM 

LS 

N 

P 

NK 

NG 

C 

t' 

COSMAX 

ROSNBROK 

0 

2 

2 

15 

13 

R 

0.973E-32 

0.999E+00 

ROSNBROK 

1 

2 

2 

59 

AA 

R 

0.9;3E-32 

0.999L+00 

ROSNHROK 

2 

2 

156 

135 

R 

0.973E-32 

0.999E+00 

HELIX 

0 

3 

3 

15 

13 

R 

0.323E-18 

0.  lOOE+01 

HELIX 

1 

3 

3 

12 

10 

R 

0.221E-23 

0.  7  77E+00 

HELIX 

2 

3 

3 

125 

37 

X 

0.86AE+0A 

0.7AAE+00 

SINGULAR 

0 

4 

A 

18 

18 

R 

0.  273F,-18 

0.962E-0A 

SINGULAR 

1 

4 

A 

22 

22 

R 

0.R06E-19 

0.551E-0A 

SINGULAR 

O 

A 

A 

30 

26 

R 

0.8A9E-19 

0.522E-0A 

WOODS 

0 

7 

A 

56 

A3 

R 

0. 390E-21 

0. 707E+00 

WOODS 

1 

7 

A 

6A 

4  A 

R 

0- 1 ' 3E-2A 

0. 772E+00 

WOODS 

2 

7 

A 

75 

51 

R 

0.523E-23 

0.721E+00 

X  AN  GW  I  LL 

0 

3 

3 

3 

3 

R 

0. 1A7E-27 

0.896E+00 

ENGVALL 

0 

5 

3 

16 

13 

R 

0. 12AE-2A 

0.999E+00 

ENGVALL 

1 

3 

20 

19 

R 

0.525E-22 

0.999E+00 

ENGVALL 

2 

5 

3 

3A 

28 

R 

0. A1 7E-2A 

0.999E+00 

BRAI.IN 

0 

2 

2 

2 

2 

R 

0. 162E-28 

0.9A5E+00 

BRAN  IN 

1 

2 

2 

16 

15 

R 

0. lOOE-27 

0.868E+00 

BRANIN 

2 

2 

2 

15 

12 

R 

0.86AF.-32 

0.868E+00 

BEALE 

0 

3 

2 

10 

9 

R 

0.893E-26 

0.5A2E+00 

BEAI.E 

I 

3 

0 

6 

6 

R 

0.1A8E-21 

0.997E-t00 

GRAGG 

0 

5 

A 

22 

21 

R 

0.289E-18 

0.197E+00 

GRAGG 

1 

5 

A 

75 

A3 

C 

0.592E+02 

0.A15E-07 

BOX 

0 

10 

3 

21 

15 

R 

0. 208E-31 

0.958E+00 

BOX 

1 

10 

3 

19 

1 1 

C 

0. 378E-01 

0.385E-11 

BOX 

2 

10 

3 

23 

lA 

C 

0. 378E-01 

0. lOAE-07 

DAVl DON  1 

0 

15 

15 

9 

8 

c 

0. 710E-0A 

0.910E-07 

FRDSTEIN 

0 

2 

2 

9 

8 

c 

0.2A5E+02 

n.600E-07 

FRDSTEIN 

1 

2 

19 

12 

c 

0. 2A5F+C2 

0.25AE-08 

FRDSTEIN 

2 

2 

2 

35 

19 

c 

0.2A5E+02 

0.A52E-11 

WATSON6 

0 

31 

6 

1 1 

10 

c 

0. llAL-02 

0.822E-07 

WATSON 9 

0 

31 

9 

1  1 

9 

c 

0. 700E-06 

C.llAE-08 

WATSON  12 

0 

31 

12 

13 

12 

c 

0.236E-09 

0.902E-08 

WATSON20 

0 

31 

20 

10 

10 

c 

0. 1 52E-1 A 

0.62AE-07 

CHEBQD8 

0 

8 

8 

24 

18 

c 

0. I 76E-02 

0.3A2E-0e 

CHEBQD8 

1 

8 

8 

85 

61 

c 

0.1 76E-02 

0.586E-07 

BROWN 

0 

20 

A 

19 

16 

c 

0.A29E+05 

0. lOlE-07 

BROWN 

1 

20 

A 

22 

19 

G 

0. 429E+05 

0.731E-07 

BROWN 

2 

20 

A 

32 

26 

c 

0. A29E+05 

0.856E-08 

BARD 

0 

15 

3 

6 

6 

c 

0. A1 lE-02 

0.203E-07 

BARD 

1 

15 

3 

A2 

28 

c 

e. 871E+01 

0.858E-07 

BARD 

2 

15 

3 

2  1 

9 

G 

0.871E+01 

0.615E-07 

JENNRICH 

0 

10 

2 

!  5 

12 

C 

0. 62*.E+02 

0.212E-08 

KOWALIK 

0 

1 1 

A 

1 1 

10 

c 

0. 1 3AE-03 

0.729E-07 

KOWAI.IK 

1 

11 

A 

163 

90 

c 

0. 51AE-03 

0.712E-07 

KOWALIK 

2 

1 1 

A 

81 

61 

c 

0. 1 5AE-03 

0.563E-07 

OSBORNEl 

0 

33 

5 

23 

19 

c 

0. 273r-0A 

0.6A5E-08 

OSBORNE2 

0 

65 

1  1 

1  7 

16 

G 

0. 201E-01 

0. 183E-07 

OSBORNE2 

1 

65 

1  1 

28 

12 

C 

0.895E+00 

0. 121E-07 

MADSEN 

0 

3 

2 

1 1 

1  I 

C 

0. 387E+00 

0.975E-07 

MADSEN 

1 

3 

2 

12 

12 

G 

0. 387F+00 

0. 185E-07 

MADSEN 

2 

3 

2 

23 

22 

C 

0. 387E+00 

0.195E-07 

MEYER 

0 

16 

3 

228 

167 

c 

O.AAOE+02 

0.211E-10 
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Table  III 

Some  Nondefault  Test  Runs 


PROBLEM 

LS 

DEFAULT 
NF  NG  C 

D 

NF 

-  I 

NG 

C 

V(LMAXO) 

=  10**10 
NF  NG  C 

V(CCONCR) 

.  10**-8 

NF  NG  C 

NOTE 

ROSNBROK 

0 

15 

13 

R 

20 

16 

R 

15 

13 

R 

15 

13  R 

ROSNBROK 

1 

59 

44 

R 

36 

27 

R 

68 

54 

R 

59 

44  R 

ROSNBROK 

2 

156 

135 

R 

54 

45 

R 

178 

140 

R 

156 

135  R 

HELIX 

0 

15 

13 

R 

8 

8 

R 

15 

13 

R 

15 

13  R 

HELIX 

1 

12 

10 

R 

11 

9 

R 

13 

11 

R 

12 

10  R 

HELIX 

2 

125 

37 

X 

17 

14 

R 

20 

16 

R 

125 

37  X 

SINGULAR 

0 

18 

18 

R 

18 

18 

R 

18 

18 

R 

18 

18  R 

SINGULAR 

1 

22 

22 

R 

22 

22 

R 

22 

22 

R 

22 

22  R 

SINGULAR 

2 

30 

26 

R 

28 

26 

R 

25 

25 

R 

30 

26  R 

WOODS 

0 

56 

43 

R 

56 

42 

R 

56 

43 

R 

56 

43  R 

WOODS 

1 

64 

44 

R 

70 

46 

R 

67 

50 

R 

64 

44  R 

WOODS 

2 

75 

51 

R 

61 

47 

R 

67 

49 

R 

75 

51  R 

ZANGWILL 

0 

3 

3 

R 

3 

3 

R 

2 

2 

R 

3 

3  R 

ENGVALL 

0 

16 

13 

R 

16 

14 

R 

16 

13 

R 

16 

13  R 

ENGVALL 

1 

20 

19 

R 

20 

18 

R 

20 

19 

R 

20 

19  R 

ENGVALL 

2 

34 

28 

R 

28 

25 

C 

33 

24 

R 

34 

28  R 

1 

BRAN  IN 

0 

2 

2 

R 

2 

2 

R 

2 

2 

R 

2 

2  R 

- 

BRANIN 

1 

16 

15 

R 

17 

15 

R 

16 

15 

R 

16 

15  R 

BRANIN 

2 

15 

12 

R 

18 

16 

R 

22 

21 

R 

15 

12  R 

BEALE 

0 

10 

9 

R 

10 

8 

R 

10 

9 

R 

10 

9  R 

BEALE 

1 

6 

6 

R 

8 

8 

R 

6 

6 

R 

6 

6  R 

CRAGG 

0 

22 

21 

R 

21 

20 

R 

22 

21 

R 

22 

21  R 

CRAGG 

1 

75 

43 

C 

48 

43 

R 

46 

39 

C 

76 

44  C 

2 

BOX 

0 

21 

15 

R 

6 

6 

R 

6 

6 

R 

21 

15  R 

BOX 

1 

19 

11 

C 

27 

15 

C 

40 

21 

C 

19 

11  C 

BOX 

2 

23 

14 

C 

29 

16 

C 

6 

6 

C 

25 

15  C 

DAVIDONl 

0 

9 

8 

C 

3 

3 

R 

3 

3 

R 

10 

9  C 

FRDSTEIN 

0 

9 

8 

C 

9 

9 

C 

9 

8 

C 

10 

9  C 

FRDSTEIN 

1 

19 

12 

C 

19 

14 

C 

14 

13 

C 

19 

12  C 

FRDSTEIN 

2 

35 

19 

C 

30 

22 

C 

19 

17 

C 

35 

19  C 

WATSON6 

0 

11 

10 

c 

8 

8 

C 

11 

10 

C 

12 

11  C 

WATSON9 

0 

11 

9 

c 

11 

9 

C 

11 

9 

C 

11 

9  C 

WATSON 12 

0 

13 

12 

c 

13 

11 

C 

13 

12 

C 

13 

12  C 

WATSON20 

0 

10 

10 

c 

9 

9 

C 

10 

10 

C 

200 

96  F 

3 

CHEBQD8 

0 

24 

18 

c 

23 

17 

C 

24 

18 

C 

24 

18  C 

CHEBQD8 

1 

85 

61 

c 

74 

61 

C 

106 

66 

C 

100 

66  X 

3 

BROWN 

0 

19 

16 

c 

15 

14 

C 

19 

16 

C 

22 

17  C 

BROWN 

1 

22 

19 

c 

16 

16 

C 

20 

18 

C 

23 

••20  C 

BROWN 

2 

32 

26 

c 

26 

24 

C 

25 

23 

C 

32 

26  C 

BARD 

0 

6 

6 

c 

6 

6 

C 

6 

6 

c 

7 

7'C 

BARD 

1 

42 

28 

c 

47 

27 

c 

33 

24 

c 

54 

3ft  C 

BARD 

2 

21 

9 

c 

29 

17 

c 

8 

7 

c 

30 

15  C 

4 

JENNRICH 

0 

15 

12 

c 

15 

12 

c 

15 

12 

c 

15 

12  C 

KOWALIK 

0 

11 

10 

c 

16 

12 

c 

11 

10 

c 

12 

11  C 

KOWALIK 

1 

163 

90 

c 

200 

72 

F 

163 

90 

c 

172 

96  C 

KOWALIK 

2 

81 

61 

c 

96 

67 

C 

79 

62 

c 

82 

62  C 

USBORNEl 

0 

23 

19 

c 

30 

26 

C 

23 

19 

c 

23 

19  C 

OSBORNE2 

0 

17 

16 

c 

17 

14 

C 

17 

16 

c 

18 

17  C 

OSBORNE2 

1 

28 

12 

c 

24 

13 

c 

28 

12 

c 

30 

13  C 

MADSEN 

0 

11 

11 

c 

11 

11 

c 

1 1 

11 

c 

12 

12  C 

MADSEN 

1 

12 

12 

c 

13 

13 

c 

12 

12 

c 

13 

13  C 

MADSEN 

2 

23 

22 

c 

20 

19 

c 

23 

22 

c 

24 

23  C 

MEYER 

0 

228 

167 

c 

350 

198 

F 

315 

202 

c 

228 

167  C 
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Notcs  on  Table  III. 

Note  1:  For  problem  ENGVALL  vith  LS  =  2,  a  local  minimi zer  x*  having 
f(x*)  =  56.1  vas  found  in  the  D  =  1  run. 

Note  2;  For  problem  CRAGG  vith  LS  =  1,  a  different  local  ninimizer  x*, 
one  having  f(x*)  =  21j9,  vas  found  in  the  V(LKAX0)  =  10««10  run  than  in  the 
DEFAULT  run. 

Note  3:  For  problems  WATS0N20,  CHEBQD8,  and  BEOVW,  a  cosine  convergence 

-8 

tolerance  of  10  appears  too  tight  for  the  double-precision  arithmetic  of  an 
IBM  370  computer.  X-convergence  did  not  occur  on  WATSCN20  because  one  of  the 
X  components  hovered  about  zero.  The  mn  vith  V(CCONCR)  =  10*»-8  achieved 
f(x)  «=  6.U6  X  lo~  on  this  problem  (and  had  f(x)  =  6.53  x  lo"  after  20 
function  and  16  gradient  evaluations). 

Note  I*:  For  problem  BARD  vith  LS  =  2,  the  D  =  I  run  found  the  solution 
obtained  in  the  DEFAULT  run  vith  LS  =  0. 

Table  IV  summarizes  test  runs  vith  three  variants  of  NL2S0L,  all  of  vhich 
used  the  default  choice  of  D^^  end  the  same  IV  and  V  inputs  as  vere  used  for 
Table  II.  The  columns  headed  PURE  GN  shov  vhat  happens  if  the  augmented  model 
is  never  used,  vhile  those  headed  PURE  S  shov  vhat  happens  if  it  is  alvays  used 
(after  the  first  iteration).  Finally,  the  columns  headed  NO  SIZING  give  the 
results  obtained  vhen  adaptive  modelling  is  alloved  but  no  sizing  is  performed. 
Ve  feel  that  Table  IV  makes  a  good  case  for  the  use  of  adaptive  modelling  vith 
slslng  in  NL2S0L. 


DEFAULT 

PURE  GN 

PURE  S 

> 

NO  SIZING 

PROBLEM 

LS 

NF 

NG 

C 

NF 

NG 

C 

NF 

NG 

C 

NF 

NG 

C 

NOTE 

ROSNBROK 

0 

15 

13 

R 

20 

16 

R 

24 

21 

R 

20 

16 

R 

ROSNBROK 

1 

59 

44 

R 

38 

33 

R 

65 

55 

R 

58 

45 

R 

ROSNBROK 

2 

156 

135 

R 

113 

103 

R 

200 

45 

F 

122 

111 

R 

HELIX 

0 

15 

13 

R 

10 

10 

R 

22 

19 

R 

10 

10 

R 

HELIX 

1 

12 

10 

R 

12 

11 

R 

25 

16 

R 

13 

10 

R 

HELIX 

2 

125 

37 

X 

16 

14 

R 

38 

23 

R 

140 

39 

X 

SINGULAR 

0 

18 

18 

R 

18 

18 

R 

29 

29 

R 

18 

18 

R 

SINGULAR 

1 

22 

22 

R 

22 

22 

R 

35 

35 

R 

22 

22 

R 

SINGULAR 

2 

30 

26 

R 

30 

26 

R 

43 

42 

R 

30 

26 

R 

WOODS 

0 

56 

43 

R 

77 

67 

R 

47 

34 

R 

80 

50 

R 

WOODS 

1 

64 

44 

R 

80 

65 

R 

47 

40 

R 

116 

67 

R 

WOODS 

2 

75 

51 

R 

74 

55 

R 

62 

47 

R 

80 

58 

R 

ZANGWILL 

0 

3 

3 

R 

3 

3 

R 

3 

3 

R 

3 

3 

R 

ENGVALL 

0 

16 

13 

R 

15 

13 

R 

19 

17 

R 

16 

13 

R 

i;ngvall 

1 

20 

19 

R 

14 

14 

R 

25 

22 

R 

18 

17 

R 

ENGVALL 

2 

34 

28 

R 

28 

27 

R 

38 

36 

R 

27 

26 

R 

BRAN  IN 

0 

2 

2 

R 

2 

2 

R 

2 

2 

R 

2 

2 

R 

BRAN IN 

1 

16 

15 

R 

16 

15 

R 

24 

23 

R 

17 

15 

R 

BRANIN 

2 

15 

12 

R 

15 

12 

R 

38 

35 

R 

15 

12 

R 

BEALE 

0 

10 

9 

R 

10 

9 

R 

19 

14 

R 

10 

9 

R 

BEAI,E 

1 

6 

6 

R 

6 

6 

R 

14 

13 

R 

6 

6 

R 

CRAGG 

0 

22 

21 

f 

21 

20 

R 

38 

35 

R 

22 

21 

R 

GRAGG 

1 

75 

43 

L 

153 

100 

C 

200 

120 

C 

200 

99 

F 

1 

BOX 

0 

21 

15 

R 

22 

16 

R 

46 

26 

R 

19 

14 

R 

BOX 

1 

19 

11 

C 

20 

12 

C 

61 

52 

R 

20 

12 

C 

BOX 

2 

23 

14 

C 

25 

16 

r 

38 

24 

C 

25 

16 

C 

$ 

DAVIDONi 

0 

9 

8 

C 

9 

8 

C 

9 

8 

C 

9 

8 

c 

s 

FRDSTEIN 

0 

9 

8 

C 

26 

15 

C 

8 

8 

C 

9 

7 

c 

FRDSTEIN 

1 

19 

12 

C 

37 

22 

C 

24 

18 

c 

22 

14 

c 

^  O 

'’RDSTEIN 

2 

35 

19 

C 

46 

22 

C 

44 

29 

c 

38 

21 

c 

P  51 

OAT SON 6 
WATSON9 

0 

0 

11 

11 

10 

9 

C 

C 

13 

11 

12 

9 

C 

C 

15 

21 

11 

14 

c 

c 

12 

12 

11 

10 

c 

c 

WATSON  12 

0 

13 

12 

C 

13 

12 

C 

21 

17 

c 

14 

11 

c 

WATSON20 

0 

10 

10 

C 

10 

10 

C 

16 

15 

c 

10 

10 

c 

CHEBQD8 

0 

24 

18 

C 

43 

29 

c 

22 

18 

c 

22 

16 

c 

CHEBQD8 

1 

85 

61 

C 

118 

78 

c 

131 

90 

c 

147 

98 

c 

w 

BROWN 

0 

19 

16 

C 

128 

84 

c 

19 

17 

c 

24 

19 

c 

S  r. 

BROWN 

1 

22 

19 

C 

153 

89 

c 

24 

23 

c 

141 

78 

c 

2 

BROWN 

2 

32 

26 

C 

82 

59 

c 

33 

28 

c 

200 

107 

F 

m  ^ 

M 

BARD 

0 

6 

6 

C 

6 

6 

c 

10 

10 

c 

6 

6 

c 

f. 

BARD 

1 

42 

28 

c 

42 

28 

c 

76 

34 

c 

43 

29 

c 

2 

BARD 

2 

21 

9 

c 

21 

9 

c 

83 

26 

c 

21 

9 

c 

2 

JENNRICH 

0 

15 

12 

c 

22 

12 

c 

13 

12 

c 

15 

13 

c 

P.t  'W/VLIK 

0 

11 

10 

c 

26 

25 

c 

20 

15 

c 

12 

11 

c 

KOWALIK 

1 

163 

90 

c 

108 

78 

c 

200 

63 

F 

99 

73 

c 

LOW ALIK 

2 

81 

61 

c 

109 

85 

G 

200 

76 

F 

184 

151 

I 

I'SbORNKl 

0 

23 

19 

0 

17 

15 

c 

34 

31 

c 

17 

15 

c 

OSii()RNE2 

0 

1  7 

16 

c 

17 

15 

c 

16 

15 

c 

17 

16 

c 

wLBORNEB 

1 

28 

12 

c 

14 

12 

c 

18 

10 

c 

30 

14 

c 

MADSEN 

u 

1 1 

1 1 

c 

45 

45 

c 

12 

12 

c 

13 

13 

c 

EN 

1 

12 

12 

c 

42 

42 

c 

16 

16 

c 

14 

14 

c 

MADSEN 

2 

23 

22 

c 

56 

55 

c 

23 

23 

c 

30 

26 

c 

MI.VER 

n 

228 

1^-7 

c 

282 

183 

c 

32) 

183 

c 

189 

138 

c 
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Notes  on  Table  IV 

Note  1:  Each  of  the  runs  listed  for  problem  CRAGG  vith  LS  =  1  found 
a  different  local  nininizer  x*.  The  DEFAULT  run  found  fCx*)  =  59*2;  the 
PURE  GII  run  found  f(x*)  =  9.98  x  10**;  the  PURE  S  run  found  f(x*)  =  J*.37; 
and  the  NO  SIZING  run  found  f(x»)  =  1.11  x  10**. 

Note  2:  While  the  other  runs  of  problem  BARD  found  the  sane  local 
nininizer  as  the  corresponding  DEFAULT  run,  the  PURE  S  runs  gave  different 
resTilts.  For  LS  =  1,  the  PURS  S  rvui  found  fix*)  =  l*.ll  x  10  (as  did 
the  DEFAULT  run  vith  LS  =  O) ,  and  for  LS  =  2,  it  found  f(x*)  =  8.1*5. 
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1.  Purpose 

Given  a  continuously  differentiable  function  (residual  vector) 

T  ■  T 

R(x)  =  (Rj(x),  R2(x),  Rjj(x))  of  p  parameters  x  -  (Xj,  X2,  ....  x^)  , 

NL2S0L  attempts  to  find  a  parameter  vector  x*  which  minimizes  the  sum-of- 

squares  function  F(x)  = 

2.  Method 


Reference  1  explains  the  algorithm  realized  by  NL2SOL  in  detail.  The 
algorithm  amounts  to  a  variation  on  Newton's  method  in  which  part 

of  the  Hessian  matrix  is  computed  exactly  and  part  is  approximated  by  a 
secant  (quasi-Newton)  updating  method.  Once  the  iterates  come  sufficiently 
close  to  a  (local)  solution,  they  usually  converge  quite  rapidly.  To  pro¬ 
mote  convergence  from  poor  starting  guesses,  NL2S0L  uses  a  mudel/trust- 
region  technique  along  with  an  adaptive  choice  of  the  model  Hessian.  Con¬ 
sequently,  Che  algorithm  sometimes  reduces  to  a  Gauss-Newton  or  Levenberg- 
Marquardt  method.  On  large-residual  problems  (in  which  F(x*)  is  large), 
however,  NL2S0L  often  works  much  better  than  these  methods. 

3.  Calling  Sequence 


CALL  NL2SOL(N,  P,  X,  CALCR,  CALCJ,  IV,  V,  UIPARM,  URPARM,  UFPARM) 
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Note:  In  the  double-precision  version  of  NL2S0L,  all  quantities  termed 
REAL  below  are  actually  DOUBLE  PRECISION. 

N  (input  INTEGER)  is  the  number  of  components  in  the  residual  vector  R. 

P  (input  INTEGER)  is  the  number  of  parameters  on  which  R  depends. 

X  (1/0  REAL  array  of  length  P)  on  input  is  an  initial  guess  at  the 
desired  solution  x*.  l^hen  NL2S0L  returns  after  converging  or 
reaching  the  iteration  limit  (i.e.,  returns  with  IV(1)  =  3,  A,  5, 

6,  or  8) ,  X  contains  the  best  parameter  estimate  found. 

CALCR  (input  subroutine)  computes  the  residual  vector  R  «  R(X)  .  when 
invoked  by : 

CALL  CALCR(N,  P,  X,  NF,  R,  UIPARM,  URPARM,  UFPARM) 

When  CALCR  is  called,  NF  is  the  invocation  count  for  CALCR;  it  is 
included  for  possible  use  with  CALCJ.  If  X  is  out  of  bounds  (e.g. 
if  R(X)  would  overflow),  then  CALCR  should  set  NF  to  0,  which  will 
cause  a  shorter  step  to  be  attempted.  CALCR  should  not  change  N,  P, 
or  X  and  should  be  declared  EXTERNAL  in  the  calling  program.  R 
should  be  declared  REAL  R(N)  . 

CALCJ  (input  subroutine)  computes  the  Jacobian  matrix  J  =  J (X)  of  first 

3R^ 

partials,  J..  “  -5 — (X),  when  invoked  by: 

Ij  dXj 

CALL  CALCJ(N,  P,  X,  NF,  J,  UIPARM,  URPARM,  UFPARM) 

When  CALCJ  is  called,  NF  is  the  invocation  count  for  CALCR  at  the 
time  when  R(X)  was  evaluated.  Except  when  J  is  restored  after  a 
covariance  matrix  has  been  computed  with  IV(COVREQ)  *  1  or  2  (see 
§6) ,  the  X  passed  to  CALCJ  is  the  one  passed  to  CALCR  on  either 
its  most  recent  invocation  or  the  one  prior  to  it.  Thus  if  CALCR 
saves  Intermediate  results  for  use  by  CALCJ,  then  it  is  possible 
to  tell  from  NF  whether  they  are  valid  for  the  current  X  (or  which 
copy  is  valid  if  two  are  kept) .  If  J  cannot  be  computed  at  X,  then 
CALCJ  should  set  NF  to  0.  CALCJ  should  not  change  N,  P,  or  X  and 
should  be  declared  EXTERNAL  in  the  calling  program.  J  should  be 
declared  REAL  J(N,P)  . 

IV  (I/O  INTEGER  array  of  length  P  +  60)  on  input  contains  certain 
values  (such  as  limits  on  the  number  of  Iterations  and  calls  on 
CALCR)  that  control  the  behavior  of  NL2S0L  and  on  output  contains 
various  counts  and  other  items  of  Interest:  see  §§5  and  6.  If 
IV (1)  «  0  on  input,  then  default  values  are  supplied  for  the  input 
components  of  both  IV  and  V.  The  caller  may  supply  nondefault 
values  for  selected  components  of  IV  and  V  by  CALLing  DFAULT(IV,  V) 
and  then  assigning  the  appropriate  nondefault  values  before  calling 
NL2S0L. 

V  (I/O  REAL  array  of  length  96  +  N*(P+3)  +  P*(7P  +  43)/2)  on  input 
contains  certain  values  (such  as  convergence  tolerances)  that 
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control  the  behavior  of  NL2S0L  and  on  output  contains  various  items 
of  interest  (such  as  F(X)  and  R(X)):  see  §§7  and  17. 

UIPARM  (INTEGER  array  of  length  determined  by  the  caller)  is  passed  without 
change  to  CALCR  and  CALCJ  and  may  be  used  by  them  in  any  way  that 
the  caller  may  find  convenient. 

URPARM  (REAL  array  of  length  determined  by  the  caller),  like  UIPARM,  is 
passed  without  change  to  CALCR  and  CALCJ. 


UFPARM  (subroutine) ,  like  UIPARM,  is  passed  without  change  to  CALCR  and 
CALCJ.  If  there  is  no  need  for  such  a  subroutine,  then  on  many 
systems  it  suffices  to  pass  an  arbitrary  variable  or  constant  for 
UFPARM.  But  if  an  actual  subroutine  is  passed,  then  it  must  be 
declared  EXTERNAL  in  the  calling  program. 


4. 


Example 

Let  n  =  3,  p  =  2,  and  R(x) 


2  . 

+  X 

sin  Xj^ 
cos  X2 


2 

2 


+ 


X1X2 


(This  problem  is 


due  to  Madsen,  Reference  3  .)  The  following  FORTRAN  code  minimizes 
F(x)  =  |r(x)\(x)  ,  starting  from  the  initial  guess  (3,  1)^,  using  a  single¬ 
precision  version  of  NL2S0L. 


INTEGER  IV (62) 

REAL  V(168),  X(2) 

EXTERNAL  MADR,  MADJ 
X(l)  *  3.0 
X(2)  >=  1.0 
IV  (1)  =  0 

CALL  NL2SOL(3,  2,  X,  MADR,  MADJ,  IV,  V,  0,  0.,  MADR) 

STOP 

END 

SUBROUTINE  MADR(N,  P,  X,  NF,  R,  UIPARM,  URPARM,  UFPARM) 
INTEGER  N,  P,  NF,  UIPARM(l) 

REAL  X(P),  R(N),  URPARM(l) 

EXTERNAL  UFPARM 

R(l)  =  X(l)**2  +  X(2)**2  +  X(1)*X(2) 

R(2)  =  S1N(X(1)) 

R(3)  =  C0S(X(2)) 

RETURN 

END 

SUBROUTINE  MADJ(N,  P,  X,  NF,  J,  UIPARM,  URPARM,  UFPARM) 
INTEGER  N,  P,  NF,  UIPARM(l) 

REAL  X(P),  J(N,P),  URPARM(l) 

EXTERNAL  UFPAR.'t 
J(l,l)  »  2.0*X(1)  +  X(2) 

J(l,2)  =  2.0*X(2)  +  X(l) 

J(2,l)  «=  C0S(X(1)) 
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J(2,2)  “  0.0 
J(3,l)  =  0.0 
J(3,2)  -  -SIN(X(2)) 

RETURN 

END 

The  main  program  above  passes  MADR  as  CALCR  and  MADJ  as  CALCJ.  Since 
no  use  is  mide  of  UIPARM,  URPARJI,  or  UFPARM,  zeroes  are  passed  for  UIPARM 
and  URPARM  rnd  >L\DR  is  passed  for  UFPARM. 

V?hen  Che  above  is  executed,  NL2S0L  prints  the  initial  X  vector,  a 
summary  of  the  iterations  performed,  the  final  X  vector,  and  some  statistics 
(including  the  final  F(X)  and  a  covariance  matrix).  If  REAL  is  changed  to 
DOUBLE  PRECISION  and  the  above  is  run  on  an  IBM  370  computer,  then  NL2S0L 
reports  variability  convergence  (IV(1)  =6  —  see  §5)  after  7  calls  on 
CALCR  and  CALCJ  and  returns  X(l)  *  -0.156234,  X(2)  =  0.698698,  and 
F(X)  =  0.386616  . 

In  this  example,  it  is  possible  to  obtain  a  slightly  smaller  value 
of  F(X)  by  decreasing  the  variability  convergence  tolerance  from  its  default 
-4 

value  of  10  .  If  the  statement  IV(1)  =  0  in  the  main  program  is  replaced  by 

CALL  DFAULTdV,  V) 

V(42)  «  0.0 

then  variability  convergence  testing  is  turned  off.  When  this  modified 
version  of  the  example  is  run  on  an  IBM  370  with  REAL  changed  to  DOUBLE 
PRECISION,  NL2S0L  reports  cosine  convergence  (IV(1)  =  4)  after  11  calls 
on  CALCR  and  CALCJ  and  returns  X(l)  *  -0.155437,  X(2)  =  0.694564,  and 
F(X)  *=  0.386600  . 

5.  Return  Codes 

When  NL2S0L  returns,  IV(1)  contains  one  of  the  following  return  codes: 

3  “  X  convergence:  see  V(XCONCR)  in  §7. 

•  4  =  cosine  convergence:  see  V(CCONCR)  in  §7. 

5  *  residual  convergence:  see  V(RCONCR)  in  §7. 

6  «  variability  convergence:  see  V(VCONCR)  in  §7. 

7  ■<  function  evaluation  limit  reached:  see  IV(MXFCAL)  in  §6. 

8  *  iteration  limit  reached:  see  IV(MXITER)  in  §6. 

9  *  STOPX  returned  .TRUE,  (external  interupt) :  see  §16. 

11  ■  F(X)  overflows  at  the  initial  X. 

12  B  bad  parameters  passed  to  ASSESS  (which  should  not  occur). 

13  B  J(X)  could  not  be  computed  (i.e.,  CALCJ  set  NF  to  0). 

14  «  one  of  the  inequalities  NNiN>Pil  is  violated.  (NN  is 

only  of  interest  to  those  who  exercise  the  reverse  communication 
option  —  see  §15.) 

15  *  NL2S0L  was  restarted  (see  §9)  with  NN,  N,  or  P  changed. 

16  *  IV(INITS)  is  out  of  range:  see  §6. 

17  ■  IV(1)  was  out  of  range  (i.e.,  was  negative  or  greater  than  10) 

when  NL2S0L  was  called. 

18  or  more  ■  V(IV(1))  is  out  of  range:  see  §S7  and  17. 

Just  before  NL2S0L  returns,  a  brief  description  of  the  return  code 
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Is  printed  (unless  all  printing  is  turned  off  by  IV(PRUN’IT)  =  0). 

6.  IV  Values 

IV  Input  Values  (Supplied  by  DFAITLT) 

IV(1) .  IV (1)  should  have  a  value  between  0  and  10  when  NL2S0L  is 

called.  0  and  10  both  mean  that  this  is  a  fresh  start;  0 
means  DFAULT(IV,  V)  should  be  invoked  to  supply  defavilt  values 
to  the  input  components  of  IV  and  V,  while  10  means  that  the 
caller  has  already  supplied  these  values.  IV (1)  input  values 
between  3  and  9  mean  that  NL2S0L  should  restart:  see  §9. 

Default  =  10. 

IV (COVPRT) . . .  IV (14)  =  1  means  print  a  covariance  matrix  at  the  solution. 

This  matrix  is  computed  as  IV(COVREQ)  dictates  just  before  a 
return  with  IV (1)  =  3,  4,  5,  or  6.  IV (COVPRT)  =  0  means  do  not 
print  a  covariance  matrix.  Default  =  1. 

t 

IV (COVREQ) . . .  IV(15)  ^  0  means  compute  a  covariance  matrix  just  before  a 
return  with  IV (1)  =  3,  4,  5,  or  6.  In  this  case,  an  approxi¬ 
mate  covariance  matrix  is  obtained  in  one  of  several  ways. 

Let  k  *  |lV(C0VREQ)|  and  let  O  =  2F(X)/max{l,N-P},  where 
2F(X)  is  the  residual  sum  of  squares.  If  k  ^  1  or  2,  then 
a  finite-difference  Hessian  approximation  H  is  obtained.  If 
H  is  positive-definite  (or,  for  k  *  3,  if  the  Jacobian  matrix 
J  •  J(X)  is  nonsingular),  then  one  of  the  following  is  computed: 

-1  T  -1 

k  -  1  »>  o*H  (J  J)H 
k  -  2  «■> 

k  «  3  =>  a*(j’^J)~^  . 

If  IV(COVREQ)  >  0,  then  both  function  and  gradient  values  (calls 
on  CALCR  and  CALCJ)  are  used  in  computing  H  (with  step  sizes 
determined  by  V(DELTAO)  ~  see  §7),  while  if  IV(COVREQ)  <  0, 
then  only  function  values  (calls  on  CALCR)  are  used  (with  step 
sizes  determined  by  V(DLTFDC)).  If  IV(COVREQ)  =  0,  then  no 
attempt  is  made  to  compute  a  covariance  matrix  (unless 
IV (COVPRT)  «=  1,  in  which  case  NL2S0L  assumes  IV  (COVREQ)  =  1 
and  )IL2SN0  assumes  IV  (COVREQ)  =  -1).  Default  *=  1. 

IV (INITS) . . . .  IV(16)  tells  how  the  S  matrix  of  Ref.  1  should  be  initialized: 

0  means  set  S  to  0  and  start  with  the  Gauss-Mewton  model;  1  and 
2  mean  that  the  caller  has  supplied  the  initial  S,  storing  its 
lower  triangle  row-wise  in  V  starting  at  V(P  +  87);  IV(INITS)  =  1 
means  start  with  the  Gauss-Newton  model,  while  IV(INITS)  =  2 
means  start  with  the  augmented  model.  Default  ■  0. 

IV(HXFCAL) . . .  IV(17)  gives  the  maximum  number  of  function  evaluations  (calls 
on  CALCR,  excluding  those  used  to  compute  the  covariance  matrix) 
allowed.  If  this  number  does  not  suffice,  then  NL2S0L  returns 
with  IV(1)  -  7.  Default  «  200. 

IV(MXITER) . . .  1V(18)  gives  the  maximum  number  of  iterations  allowed.  It  also 
indirectly  limits  the  number  of  gradient  evaluations  (calls  on 
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CALCJ,  excluding  those  used  to  compute  the  covariance  matrix) 
to  IV^KITER)  +  1.  If  IV(MXITER)  iterations  do  not  suffice, 
then  NL2S0L  returns  with  IV(1)  =  8.  Default  =  150. 

IV(OUTLEV) . . .  IV(19)  controls  the  number  and  length  of  iteration  summary 
lines  printed.  IV(OUTLEV)  =  0  means  do  not  print  any  summary 
lines.  Otherwise,  print  a  summary  line  after  each  1 IV(OUTLEV) j 
Iterations.  Long  summary  lines  are  printed  if  IV(OUTLEV)  >  0, 
short  lines  if  IV(OUTLEV)  <  0.  See  §13  for  more  details. 

Default  =  1. 

IV(PARPRT) . . .  IV(20)  =  1  means  print  any  nondefault  V  values  on  a  fresh 
start  or  any  changed  V  (input)  values  on  a  restart. 

IV(PARPRT)  =  0  means  skip  this  printing.  Default  =  1. 

IV (PRUNIT) . . .  IV (21)  is  the  output  unit  number  on  which  all  printing  is 
done.  IV (PRUNIT)  =  0  means  suppress  all  printing.  (Setting 
IV (PRUNIT)  to  0  is  the  only  way  to  suppress  the  one-line 
termination  message  printed  before  NL2S0L  returns.) 

Default  =  standard  output  unit  (unit  6  on  most  systems);  the 
default  for  IV(PRUNIT)  is  actually  I^^DC0N(1):  see  §14. 

IV(SOLPRT) . . .  IV(22)  =  1  means  print  the  X  returned  (along  with  the 

corresponding  gradient  and  scale  vector  D) .  IV(SOLPRT)  =  0 
means  skip  this  printing.  Default  =  1. 

IV (STATPR) . . .  IV(23)  =  1  means  print  summary  statistics  upon  returning. 

•These  consist  of  the  function  value  (half  the  residual  sum  of 
squares)  at  X,  the  variability  of  the  last  step  (see  V(VCONCR) 

In  §7),  the  number  of  function  and  gradient  evaluations  (calls 
on  CALCR  and  CALCJ  respectively,  excluding  any  calls  used  in 
computing  the  covariance),  the  2-norm  of  the  gradient  at  X, 
the  corresponding  V(COSMAX)  (see  V(CCONCR)),  and  the  number  of 
calls  (if  positive)  on  CALCR  and  CALCJ  used  in  trying  to  compute 
covariance  matrices.  IV(STATPR)  =  0  means  skip  this  printing. 
Default  =  1. 

IV (XOPRT) . . . .  IV(24)  «  1  means  print  the  initial  X  and  scale  vector  D  if 

this  is  a  fresh  start.  IV (XOPRT)  =  0  means  skip  this  printing. 
Default  «  1. 

IV  Output  Values  of  Primary  Interest 

IV(I) . .  rv(l)  is  the  return  code:  see  S5. 

IV(COVMAT) . . .  IV(26)  tells  whether  a  covariance  matrix  was  computed.  If 

IV(C0V>L4T)  is  positive,  then  the  lower  triangle  of  the  covari¬ 
ance  matrix  is  stored  row-wise  in  V  starting  at  V(IV(COVMAT) ) . 

If  IV(COVMAT)  *  0,  then  no  attempt  was  made  to  compute  a  co- 
variance  matrix.  If  IV(COVMAT)  -  -1,  then  the  finite-difference 
Hessian  H  was  indefinite  (or,  for  |lV(C0VREQ)|  »  3,  the  current 
Jacobian  matrix  is  singular;  see  IV(COVREQ)  above).  And  if 
IV(COVMAT)  ■  -2,  then  a  successful  finite-difference  step  could 
not  be  found  for  some  component  of  X  (i.e.,  CALCR  set  NF  to  0 
for  each  of  two  trial  steps).  Note  that  IV(C0V>L4T)  is  reset  to 
0  after  each  successful  Iteration,  so  that  if  a  lower  function 
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value  is  found  after  a  restart,  then  a  new  attempt  will  be 
made  to  conputo  a  covariance  matrix. 


IV (D) .  IV(27)  is  the  starting  subscript  in  V  of  the  current  scale 

vector  D  (see  V(D0)  in  §7). 

IV(G) .  1V(2S)  is  the  starting  subscript  in  V  of  the  current  gradient 

T 

vector  g  =  J(X)  R(X). 


IV (NFCALL) . . .  IV(6)  is  the  number  of  calls  so  far  made  on  CALCR  (i.e., 
the  number  of  function  evaluations,  including  those  usod  in 
computing  covariance  matrices). 

IV(NFCOV) . . . .  IV (40)  is  the  number  of  calls  made  on  CALCR  when  computing 
covariance  matrices. 

IV(NGCALL) . . .  IV(43)  is  the  number  of  calls  so  far  made  on  CALCJ  (i.e., 
the  number  of  gradient  evaluations,  including  those  used  in 
computing  covariance  matrices) . 

IV(NGCOV) . . . .  IV(41)  is  the  number  of  calls  made  on  CALCJ  when  computing 
covariance  matrices. 

IV(NITER) . . . .  IV(44)  is  the  number  of  iterations  performed. 

IV(R) .  1V(50)  is  the  starting  subscript  in  V  of  the  residual 

vector  R(X) . 

7,  V  Values  of  Primary  Interest 


Many  of  the  V  input  components  described  here  and  in  §17  must  lie 
within  a  certain  range  of  values.  If  such  a  component  falls  outside  the 
range  indicated  below  (and  in  §17)  at  the  beginning  of  its  description, 
then  module  PARCHK  will  print  an  error  message  (unless  IV(PRUNIT)  =  0) 
and  will  force  NL2S0L  to  return  immediately  with  IV(1)  >18. 

Frequent  reference  is  made  below  to  two  quantities:  MACHEP  and  the 
scale  vector  D.  MACHEP  is  the  unit  roundoff  for  the  floating  point  arith¬ 
metic  being  used  —  see  §14.  The  scale  vector  D  is  the  diagonal  of  the 
(diagonal)  scale  matrix  discussed  in  §§5  and  7  of  [1];  this  scale 
matrix  is  denoted  by  diag(D)  below. 

V  Input  Values  of  Primary  Interest  (Supplied  by  DFAULT) 

V(CCONCR)...  V(29)  e  [0,  1]  is  the  cosine  convergence  tolerance.  Let  J 

th  ^ 

denote  the  i —  column  of  the  n  p  Jacobian  matrix  J  and  let 

cosmax(R,J)  =  max{  |j~R][  ♦]|j~]]['  *  ^  JTOL(i),  1  <  i  $  p}, 

‘2  ^2 

where  JTOL  is  described  with  V(JTINIT)  below.  If  NL2S0L  finds 
an  X  giving  cosmax(R(X) , J(X) )  ^  V(CCONCR),  then  it  returns 

with  1V(1)  =  4.  Default  =  max{lo”^,  1000‘MACHEP} . 

V(DE'.TAC).  .  V(31)  e  [MACHEP,  1]  helps  pick  the  finite-difference  steps 
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V(DFAC)... 
V(DINIT). . 
V(DLTFDC). 

V(DLTFDJ). 

V(DO) . 


V(JTINIT). 


used  in  conputing  H  when  IV(COVREQ)  =  1  or  2.  The  step  used 
for  component  X(i)  is 

V(DELTAO)  •  max{|X(i)|,  1/D(i)}  «  sign(X(i)), 
where  D  is  the  current  scale  vector,  (If  this  step  results 
in  CALCR  setting  NF  to  0,  then  -0.5  times  this  step  is  also 

tried.)  Default  =  MACHEP^^^^\ 

. .  V(32)  E  [0,  1]  and  V(D0)  are  used  in  updating  the  scale  vector 
D  —  see  V(D0)  below.  Default  =  0.6. 

. .  V(33),  if  nonnegative,  is  the  value  to  which  all  components 
of  the  scale  vector  D  are  initialized.  Default  =  0. 


. .  V(3A)  e  [>L\CHEP,  1]  helps  pick  the  step  sizes  used  in  com¬ 
puting  H  when  IV(COVREQ)  =  -1  or  -2.  For  differences  involv¬ 
ing  X(i),  the  step  first  tried  is 

V(DLTFDC)  •  niax{|x(i)|,  1/D(i)}. 

(If  this  step  is  too  large,  i.e.,  if  CALCR  sets  NF  to  0  when 
this  step  is  first  tried,  then  -0.5  times  this  step  is  also 

tried.)  Default  =  ILACHEP 

. .  V(35)  £  [MACHEP,  1]  helps  pick  the  step  sizes  that  NL2SN0 
uses  when  computing  its  finite-difference  approximation  to 
the  Jacobian  matrix  (see  §8).  For  differences  involving  X(i), 
the  step  first  tried  is  V(DLTFDJ)  •  max{]X(i)|,  1/D(i)}. 

(If  this  step  is  too  large,  i.e.,  if  CALCR  sets  NF  to  0, 
then  smaller  steps  are  tried  until  the  step  size  is  shrunk 

below  1000*MACHEP.)  Default  =  MACHEP^^^^^. 


V(36)  and  V(DFAC)  are  used  in  updating  the  scale  vector  D. 
If  V(D0)  >  0,  then  at  the  start  of  each  iteration,  D(i)  is 


set  to 
max 

where  J 


{[llJ.f  +  max{S..,0}]^^^,  V(DFAC)*D(i),  JTOL(i),  V(D0)}, 
i  2  il 

.  is  the  i —  column  of  the  current  Jacobian  matrix. 


S  is  the  S  matrix  of  [1],  and  JTOL  is  the  array  described  with 
V(JTINIT)  below.  If  -1  <  V(D0)  <  0,  then  D  is  set  to  the  above 
values  (after  any  initialization  due  to  V(DINIT))  on  the  first 
Iteration  and  is  not  changed  again.  If  V(D0)  =  0,  then  all 
components  of  D  are  set  to  1  (regardless  of  V(DINIT)),  which 
usually  gives  good  performance  on  well-scaled  problems.  If 
V(D0)<-1,  then  it  is  assumed  that  the  caller  has  chosen  D  and 
has  stored  it  in  V,  starting  at  V(96  +  2N  +  P(7P  +  41J/2). 

Default  =  10 


. .  V(38)  >  0.  For  1  $  i  $  P,  JTOL(i)  is  a  tolerance  used  to 
decide  whether  the  i-th  column  of  the  Jacobian  matrix  should 
be  considered  to  be  zero.  If  V(JTINIT)  >  0,  then  all  components 
of  the  JTOL  array  are  set  to  V(JTINIT),  and  if  V(JTINIT)  =  0, 
then  it  is  assumed  that  the  caller  has  stored  JTOL  in  V  start¬ 
ing  at  V(87).  Default  -  10“^. 

V(39)  >  0  gives  the  maximum  2-norm  allowed  for  the  very  first 


V(L>LAX0) . . . . 
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step  that  NL2S0L  attempts.  On  problems  where  this  step  would 
otherwise  be  inordinately  large,  it  is  very  useful  to  assign 
a  modest  value  to  V(DL\X0).  Default  =  100. 

V(RCONCR)...  V(40)  >  0  is  the  residual  convergence  tolerance.  If 

||R(X)||  <  V(RCON-CR),  then  NL2S0L  returns  with  IV(1)  =  5. 

2  -9 

Default  =  10 

V(VCONCR)...  V(42)  5  0  is  the  variability  convergence  tolerance.  If  H 
is  the  current  Hessian  approximation,  then  the  variability 
of  the  current  step  Ax  is 

V(VARIAB)  =  max{l,N-P}  •  Ax^HAx  /  (2F(X)), 
where  2F(X)  is  the  current  residual  sum  of  squares.  If 
V(VARIAB)  $  V(VCOXCR),  then  NL2S0L  returns  with  IV(1)  =  6. 

-4 

Default  =  10 

V(XCONCR)...  V(28)  ^  0  is  the  X  convergence  tolerance.  If  a  step  Ax  is 
tried  that  yields  a  much  smaller  function  decrease  than 
expected,  and  if  V(RELDX)  ^  V(XCONCR),  where  V(RELDX)  is 
the  maximum  relative  change  in  any  component  of  X  [which, 
for  X  =  X^  +  Ax,  is  computed  as 

lx(i)  -  X  (i)l 

V(ItELDX)  .  ^  :  I  S  1  S  PH. 

then  NL2S0L  returns  with  IV(1)  =  3.  Default  =  lOOO'MACHEP. 

V  Output  Values  of  Primary  Interest 

V(COSMAX)...  V(43)  =  cosmax(R(X),  J(X))  ~  see  V(CCONCR)  above. 

V(DGNORM)...  V(l)  =  II dlag(D) J(X)^R(X)|[  ,  where  D  is  the  current  scale 

T  ^ 

vector  and  J(X)  R(X)  *  VF(X). 

V(DSTNRM)...  V(2)  =  |1  diag(D)Ax||  ,  where  Ax  is  the  most  recently  tried 
step.  ^ 

V(F) . V(10)  =  F(X)  =  11R(X)11V  2. 

V(RELDX)...<  V(17)  is  the  maximum  relative  change  in  X  caused  by  the  most 
recent  step  —  see  V(XCONCR)  above. 

V(VARIAB)...  V(50)  is  the  variability  of  the  most  recent  step  —  see 
V(VCONCR)  above. 

8.  Finite-Difference  Jacobians  —  NL2SNO 

Those  who  do  not  wish  to  code  a  subroutine  CALCJ  for  (analytically) 
computing  the  Jacobian  matrix  may  avoid  doing  so  by  calling  NL2SX0  instead 
of  KL2SOL.  NL2SN0  computes  an  approximate  Jacobian  matrix  by  forward 
differences  (using  a  step  size  determined  by  V(DLTFDJ)  —  see  §7).  The 
calling  sequence  for  NL2SN0  amounts  to  the  one  for  NL2S0L  with  CALCJ  omitted 

CALL  NL2SN0(N,  P,  X,  CALCR,  IV,  V,  UIPARM,  URPARM,  UFPARM) 
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The  parameters  for  NL2SN’0  are  the  sane  as  the  corresponding  ones 
for  KL2S0I,,  One  minor  exception  occurs  with  the  handling  of  IV'(COVREQ): 
if  IV(COVPRT)  =  1  and  IV(COVKEQ)  =  0,  then  NL2SN0  sets  IV(COVREQ)  =  -1; 
otherwise  NL2SN-0  sets  IV(COVREQ)  to  -jlVCCOVREQ)  ]  .  Thus  RL2SNO  uses 
function  values  only  in  computing  covariance  matrices  and  V(DELTAO)  is 
not  used. 

9.  Restarting 

After  any  return  with  3  <  IV(1)  <  9,  it  is  possible  to  change  some 
of  the  IV  and  V  input  components  (such  as  the  convergence  tolerances  and 
the  iteration  and  function  evaluation  limits)  and  call  NL2S0L  (or  NL2SNO) 
again  with  IV(1)  unchanged.  This  causes  the  algorithm  to  be  resumed  at 
the  point  where  it  was  interrupted.  (It  is  even  possible  to  save  IV,  V, 
and  X  and  then  restart  in  a  subsequent  run.) 

10.  Scaling 

Problems  sometimes  arise  which  are  poorly  scaled  in  the  sense  that 
the  various  components  of  X  are  expressed  in  widely  differing  units.  With 
the  default  choice  of  the  scale  matrix  D  (see  V(D0)  and  the  beginning  of 
§7),  the  behavior  of  NL2S0L  is  largely  insensitive  to  this  kind  of  poor 
scaling.  On  well  scaled  problems,  its  performance  can  often  be  improved 
by  choosing  D  to  be  the  identity  matrix  (i.e.,  setting  V(D0)  =  0).  Some¬ 
times  it  may.  also  be  worthwhile  to  fix  D(i),  1  <  i  <  P,  at  the  2-norm 

of  the  i-th  column  of  the  Initial  Jacobian  matrix  (by  setting  V(L0)  =  -0.5). 

11.  LMAXQ;  Limiting  the  First  Step  Length 

On  some  problems  it  is  necessary  to  give  V(LMAXO)  =  V(39)  a> small 
value  to  prevent  a  disasterously  large  first  step,  one  which  might  lead 
to  exponent  overflow  or  arguments  out  of  range  to  intrinsic  functions. 

Even  if  no  disaster  occurs,  if  NL2SOL  takes  many  function  evaluations  on 
the  first  step,  then  performance  might  be  improved  by  a  much  smaller  (or 
in  some  cases  larger)  value  of  V(LMAXO). 

Note  that  if  Ax  Is  the  very  first  step  attempted,  then  ||Axj|  rather 

2 

than  II diag(D)AxjI  is  bounded  above  by  V(LMAXO),  because  it  was  felt  that 
2 

the  caller  might  have  a  better  idea  of  how  ||Ax||  should  be  limited.  As  a 
result,  V(LMAXO)  is  a  scale-sensitive  quantity. 

12.  Local  Solutions 


It  can  easily  happen  that  NL2S0L  only  finds  a  local  mlnimizer  of  the 
8um-of-squares  function  F(X)  and  that  a  different  starting  guess  would 
cause  a  point  to  be  found  at  which  F  has  a  still  smaller  value.  Except 
for  cases  where  special  conditions  (such  as  convexity  of  the  objective 
function)  prevail,  this  shortcoming  is  shared  by  all  minimization  algorithms. 

13.  Printed  Output 


Any  printing  is  done  by  one  of  two  modules:  ITS>[RY  and  PARCHK. 
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I’ARCHK  reports  any  V  input  components  that  are  out  of  range  and  optionally 
lists  any  such  components  that  have  nondcCault  or  changed  values  (on  a 
fresh  start  or  restart  respectively).  ITSMRY  docs  the  remaining  printing. 
V'arious  IV  input  components  control  v.’hat  printing  is  done  —  see  §6. 

If  IV(OUTLEV)  >  0,  then  ITSliRY  produces  an  iteration  summary  which 
includes  the  following  values;  IT,  the  current  iteration  number;  NF,  the 
number  of  function  evaluations  (calls  on  CALCK) ,  excluding  any  extra  ones 
needed  for  computing  covariance  matrices  and,  in  the  case  of  NL2SN0, 
excluding  the  extra  ones  needed  to  compute  finite-difference  Jacobian 
matrices;  F,  the  current  function  value  (half  the  residual  sum  of  squares); 
DF ,  the  difference  between  the  previous  and  current  function  values;  COSMAX, 
the  current  cosniax(R(X) ,  J(X))  —  see  V(CCOXCR)  in  §7;  VAR,  the  current 
step  variability  —  see  V(VCONCR)  in  §7;  MODEL,  which  tells  w’hich  models 
were  used  in  computing  the  current  step  [G  =  the  Gauss-Newton  model; 

S  =  the  augmented  model;  G-S  means  the  Gauss-Newton  model  was  tried  first 
and  a  switch  was  then  made  to  the  augmented  model;  S-G,  G-S-G,  and  S-G-S 
have  analogous  meanings];  L:\1IEDA,  the  current  Levenberg-Marquardt  param¬ 
eter  X  (which  is  nonnegative  if  the  step  Ax  just  taken  satisfies 

[H  +  X*diag(D)  ]Ax  =  -VF(X-Ax),  v:here  H  is  the  current  Hessian  approxi¬ 
mation,  and  is  negative  if  the  special  case  discussed  in  [2]  v?as  detected); 
RELDX,  the  current  value  of  V(RELDX)  —  see  V(XCONCR)  in  §7;  G,  the  current 

value  of  llVF(X)(l  =  |I J(X)'^R(X)|j  ;  SIZE,  the  sizing  factor  just  used  in 
2  2 

updating  the  S  matrix  (see  [1]);  and  D*STEP,  the  current  value  of  V(DSTNRM) 
—  see  §7.  These  summary’  lines  arc  118  characters  long  (including  the 
carriage  control  character).  If  IV(OUTLEV)  <  0,  then  lines  of  maximum 
length  69  (or  55  if  IV(COVPRT)  =  0)  are  generated,  and  the  iteration 
summary  includes  only  the  first  six  items  described  above  (i.e.,  IT,  NF,F, 
DF,  COSMAX,  and  VAR). 


14 .  Changing  Computers 

The  NL2S0L  distribution  tape  contains  both  single-  and  double-precision 
versions  of  the  NL2S0I.  source  code,  so  it  should  be  unnecessary  to  change 
precisions.  (On  computers  having  only  32  or  36  bits  per  REAL  word,  double 
precision  often  gives  better  performance.) 

Only  the  functions  IMDCON  and  RMDCON  contain  machine-dependent  constants 
To  change  from  one  computer  to  another,  it  should  suffice  to  change  the 
DATA  statements  in  these  functions.  The  D.ATA  statement  in  IMDCON  sets 
MDCO.N'(l)  to  the  output  unit  number  that  DFAULT  supplies  to  IV(PRUNIT). 

The  machine-dependent  DATA  statement  in  RMDCON  provides  three  values:  BIG, 
ETA,  and  MACHE? .  BIG  is  the  largest  floating-point  number  such  that  a 
FORTRAN  program  can  compute  SQRT(0.999*BIG)**2  [i.e.,  DSQRT(0.999D0*BIG)**2 
in  DOUBLE  PRECISION]  without  overf low’ing.  Similarly,  ET.A  is  the  smallest 
floating-point  number  such  that  SQRT (1 . 001*ETA) **2  [or  DSQRT(1.001D0*ETA)**2 
respectively]  does  not  underflow.  MACHEP  is  the  unit  roundoff,  i.e.,  the 
smallest  floating-point  number  such  that  1  +  MACHEP  yields  a  stored  floating 
point  number  greater  than  1  and  1  -  MACHEP  yields  a  stored  number  less  than 
1.  (So;:;-’  corpute.'s  feature  registers  that  carry  more  bits  than  can  be  stored 

■'AC  EP  should  only  reflect  the  accuracy  of  numbers  that  can  be  stored.) 
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Thc  test  progran  supplied  on  the  NL2S0L  distribution  tape  places  the 
further  restriction  on  BIG  and  ETA  that  EXP(1.999*ALOG(SQRT(0.999*BIG))) 
and  EXP ( 1 . 999*ALOG (SQRT ( 1 . 001*ETA) ) )  (DEXP ( 1 . 999De*nLOG (DSQRT (0 . 999D0*BIG) ) ) 
and  DEXP(1.999D0*DL0G(DSQRT(1.001D0*ETA)))  in  DOUBLE  PRECISION]  not  overflow 
or  underflow.  DATA  statements  giving  suitable  values  for  BIG,  ETA,  and 
MACHEP  for  a  variety  of  computers  appear  as  comments  in  RMDCON. 

Intrinsic  functions  are  explicitly  declared  in  the  NL2S0L  source  code. 
On  certain  computers  (e.g.  Univac) ,  it  may  be  necessary  to  comment  out 
these  declarations.  So  that  this  may  be  done  automatically  by  a  simple 
program,  such  declarations  are  preceded  by  a  comment  having  C/+  in  columns 
1-3  and  blanks  in  columns  4-72  and  are  followed  by  a  comment  having  C/ 

In  columns  1  and  2  and  blanks  in  columns  3-72. 

15.  Using  Reverse  Communication  —  NL2ITR 

Instead  of  writing  subroutines  CALCR  and  CALCJ  to  compute  the  residual 
vector  R(X)  and  Jacobian  matrix  J(X),  one  can  call  NL21TR  and  provide  R  and 
J  by  reverse  communication.  The  calling  sequence  is: 

CALL  NL2ITR(D,  IV,  J,  N,  NN,  P,  R,  V,  X) 

Parameters  IV,  N,  P,  V,  and  X  are  the  same  as  the  corresponding  parameters 
to  NL2S0L,  with  the  following  exceptions:  V  need  only  contain 
96  +  2N  +  P[7P  +  41]/2  elements,  since  the  storage  that  1IL2S0L  and  NL2SN0 
allocate  for  D,  J,  and  R  at  the  end  of  V  is  not  needed;  and  components 
IV (D),  IV(J),  and  IV(R)  are  not  referenced.  D  is  the  scale  vector  (dimen¬ 
sioned  D(P)).  NN  is  the  (integer)  lead  dimension  for  the  J  array,  which 
is  dimensioned  J(NN,P);  NN  must  satisfy  NN  5  N. 

When  NL2ITR  is  first  called  (with  IV(1)  =  0  or  10),  J  must  have  been 
set  to  J(X),  R  to  R(X) .  When  NL2ITR  wants  R  to  be  evaluated  at  a  new  X, 
it  returns  with  IV(1)  =  1;  the  caller  should  then  set  R  to  R(X)  (unless 
X  is  out  of  range,  in  which  case  the  caller  should  set  IV(TOOBIG),  i.e., 
IV(2),  to  1)  and  call  NL2ITR  again.  Similarly,  when  NL2ITR  wants  J  to 
be  evaluated  at  X,  it  returns  with  IV(1)  =  2,  and  the  caller  should  then 
set  J  to  J(X)  and  call  NL2ITR  again.  (If  J  cannot  be  evaluated  at  X,  the 
caller  may  set  IV(NFGCAL),  i.e.,  1V(7),  to  0;  this  will  cause  NL2ITR  to 
give  the  error  return  IV(1)  »  13.) 

16.  STOPX 

It  is  possible  to  arrange  for  NL2S0L  (NL2SN0  and  NL2ITR)  to  be  inter¬ 
rupted  when  used  in  an  interactive  environment.  To  do  this,  it  is  necessary 
to  replace  the  logical  function  STOPX  supplied  with  the  NL2S0L  package 
(which  always  returns  .FALSE.)  by  a  system-dependent  STOPX  that  returns 
•TRUE,  if  and  only  if  the  "break"  (i.e.,  "interrupt")  key  has  been  pressed 
since  the  last  call  on  STOPX.  Once  this  is  done,  NL2S0L  will  return  with 
1V(1)  =  9  when  the  "break"  key  is  pressed  before  some  other  return  has 
occurred.  It  is  then  possible  to  change  some  of  the  IV  and  V  input  com¬ 
ponents  and  restart  —  see  §9. 

17.  Other  V  Input  Values 
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V(COSMIN)...  V(30)  e  [MACIIEP,  1]  is  the  minimum  absolute  cosine  allowed 

between  the  step  just  taken,  :\x,  and  the  corresponding  change 
in  gradients,  Ag,  for  a  full  update  of  the  S  matrix  to  be 

performed.  If  |Ax^Agl/(|l  Ax]|  •  ||Ag|(  )  <  V(COSMIN),  then  Ax'^Ag 

2  2 

is  replaced  in  the  update  formula  by 

sign(Ax'^Ag) 'VCCOSMlN)  *1)  Axj|  *  JjAgj)  . 

*6  2  2 
Default  =  max { 10  ,  100*MACHEP}. 

V(DECFAC)...  V(21)  e  tO.Ol,  0.8]  is  the  factor  by  which  the  radius  of  the 
trust  region  is  shrunk  if  CALCR  sets  NF  to  0  (or  NIi2ITR  is 
called  with  IV (1)  =  1  and  IV(TOOBIG)  =  1).  Default  =0.5. 

V(EPSLON)...  V(18)  e  (0.001,  0.9]  is  the  maximum  relative  difference  al- 

T  X 

lowed  between  Vf(X)  Ax  +  ■jAx  kAx  and  its  optimal  value 

(subject  to  the  constraint  ||diag(D)Ax||  $  V (RADIUS) ) ,  where 

Ax  is  the  step  being  computed  and  H  is  the  current  Hessian 
approximation.  This  is  used  in  detecting  and  handling  the 
special  case  discussed  in  [2].  Default  =  0.1. 

V(PUZZ) .  V(37)  e  (1.01,  loo]  is  used  in  the  test  that  decides  whether 

to  switch  models.  If  q  is  the  current  model  for  F  (near 

n, 

■the  point  x)  and  q  is  the  other  model,  and  if 

V(FUZ2>*jq(X  +  Ax)  -  F(X  +  Ax)  |  <  |q{X  Ax)  -  F(X  +  Ax)  [ , 
then  the  models  are  switched.  Default  =  1.41421. 

V(INCFAC)...  V(22)  E  (1.2,  100]  is  the  factor  by  which  the  radius  of  the 
trust  region  is  increased  if  module  ASSESS  deems  this  worth¬ 
while  (or  the  gradient  tests  performed  in  NL2ITR  indicate 
that  this  should  be  done).  Default  =  2.0. 

V(PHMNFC)...  V(19)  e  (-0.99,  -O.OOl]  is  the  minimum  value  allowed  for 

[||diag(D)  Axil  -  V(RADIUS)  ]  A’CRADIUS)  .  Default  =  -0.1. 

•  2 

V(PHMXFC)...  V(20)  e  (1.2,  100]  is  the  maximum  value  allowed  for 

tlldiag(D)Axjl  -  V (RADIUS)  ]/V (RADIUS) .  Default  =  0.1. 

2 

V(RDFCMN),,.  V(23)  E  (0.01,  0.8]  is  the  minimum  factor  by  which  the  trust 
region  radius  V (RADIUS)  may  be  shrunk.  Default  =  0.1. 

VCRLIMIT) . . .  V(41)  £  [10^®,  BIG]  is  the  largest  value  allowed  for  11r{X)|1 

2 

before  F(X)  is  considered  to  overflow.  Default  =  V^O.999 ‘BIG, 
where  BIG  is  described  in  §14. 

V(TUNERl)...  V(24)  E  (0,  0.5] .  If  the  actual  function  reduction  is  no 
more  than  V(TUNERl)  times  its  predicted  value,  then  module 
ASSESS  will  consider  switching  models  or  decreasing  the  trust 
region  radius.  Default  =  o.l. 

...  V(25)  ^1.  If  the  actual  function  reduction  is  at  least 
V{TUNER2)  times  its  predicted  value,  then  ASSESS  decides  to 
increase  the  trust  region  radius.  Default  =  10. 


V(TUNER2) 


-40- 


V(TUNER3)...  V(26)  e  [0.001,  1].  If  F(X+Ax)  -  F  (X)  <  V(TU:iER3) ‘Ax^VF  (X)  , 
then  ASSESS  decides  to  increase  the  trust  region  radius. 

Default  =  0.75. 

V(TUNER4)...  V(27)  e  (0,  1].  If  ASSESS  accepts  the  step  Ax  without  other¬ 
wise  deciding  to  increase  or  decrease  the  trust  region  radius, 
and  if  either 

||VF(X+Ax)  -  VF(X)  -  HAx]!  <  V(TUNER4)  •||VF(X+ Ax)|| 

T  ^  T  .  ^  . 

or  Ax  7F(X+Ax)  <  V(TUNER3)*Ax  Vf(X),  then  the  radius  is 

increased.  Default  =  0.5. 


18.  Storage  Requirements 


The  NL2S0L  package  (excluding  test  program)  amounts  to  about  5000 
lines  of  FORTRAN  source  code,  which  includes  about  1900  executable  state¬ 
ments  and  around  40  FORMAT  statements.  When  compiled  on  an  IBM  370  by 
the  H-extended  ccxnpiler  with  OPT (2),  this  source  code  results  in  a  little 
under  54000  bytes  of  object  code.  The  amount  of  variable  storage  used  is 
listed  above  in  §3. 
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assessing  the  quality  of  the  step  thus  computed.  These  and  other  ideas  behind 
NL2S0L  are  discussed  eind  its  evolution  and  current  implementation  are  also 
described  briefly. 


